BASIC BUSINESS STATISTICS 15TH EDITION BY MARK L BERENSON, DAVID M LEVINE, KATHRYN A SZABAT, DAVID F


SOLUTIONS MANUAL FOR Christine Verity

BASIC BUSINESS STATISTICS FIFTEENTH EDITION

Mark L. Berenson Montclair State University

David M. Levine Baruch College, City University of New York

Kathryn A. Szabat La Salle University

David F. Stephan Two Bridges Instructional Technology PPID: A103000240340

Copyright ©2024 Pearson Education, Inc.



Table of Contents

Teaching Tips

Chapter 1: Defining and Collecting Data

Chapter 2: Tabular and Visual Summarization of Variables

Chapter 3: Numerical Descriptive Measures

Chapter 4: Basic Probability

Chapter 5: Discrete Probability Distributions

Chapter 6: The Normal Distribution and Other Continuous Distributions

Chapter 7: Sampling Distributions

Chapter 8: Confidence Interval Estimation

Chapter 9: Fundamentals of Hypothesis Testing: One-Sample Tests

Chapter 10: Two-Sample Tests

Chapter 11: Analysis of Variance

Chapter 12: Chi-Square and Nonparametric Tests

Chapter 13: Simple Linear Regression

Chapter 14: Introduction to Multiple Regression

Chapter 15: More Complex Multiple Regression Models

Chapter 16: Time-Series Forecasting

Chapter 17: Business Analytics

Chapter 18: Getting Ready to Analyze Data in the Future

Chapter 19: Statistical Applications in Quality Management (Online)

Chapter 20: Decision Making (Online)

Online Sections

Instructional Tips and Solutions for Digital Cases

The Craybill Instrumentation Company Case

The Mountain States Potato Company Case

The O. Hara Performance Consulting Case

The Sure Value Convenience Stores Case

The Choice Is Yours/More Descriptive Choices Follow-up Case

The Claro Mountain State Student Surveys Case

The Shelter Bay Lifestyles Case

The Tri-Cities Times Case

Chapter 1

1.1

(a) The menu items represent a categorical variable. Each menu item represents a separate category.

(b) The variable that contains the prices is a numerical variable.

(c) One menu item and the price of that item would be one instance or occurrence of data.

1.2

(a) Business size represents a categorical variable because each size represents a particular category.

(b) The measurement scale is ordinal, because of the different sizes.

1.3

The variable speed trials is a continuous numerical variable because time can have any value from 0 to any reasonable unit of time.

1.4

(a) The telephone number assigned to the smartphone is a categorical variable.

(b) The data usage for a current month (in GB) is a numerical variable that is continuous because any value within a range of values can occur.

(c) The length (in minutes and seconds) of the last voice call made using the smartphone is a numerical variable that is continuous because time can have any value from 0 to any reasonable unit of time.

(d) The number of apps installed on the smartphone is a numerical variable that is discrete because the outcome is a count.

(e) Whether a device protection plan exists is a categorical variable because the answer can be only yes or no.

1.5

(a) numerical, ratio

(b) numerical, ratio

(c) categorical, nominal

(d) categorical, nominal

(e) numerical, ratio

1.6

(a) numerical, continuous

(b) categorical

(c) numerical, discrete

(d) categorical

(e) categorical

1.7

(a) numerical, ratio scale, continuous

(b) numerical, ratio scale

(c) numerical, ratio scale, discrete

1.8

(a) numerical, continuous

(b) numerical, discrete

(c) numerical, continuous

(d) categorical

(e) categorical

1.9

(a) Income may be considered discrete if we “count” our money. It may be considered continuous if we “measure” our money; we are only limited by the way a country’s monetary system treats its currency.

(b) The first format would provide more information because it includes a ratio value while the second measure would only include a range of values for each choice category.

1.10

The variable test score would be numerical, and presumably, in the range of 0 through 100. If fractional credit for an answer is possible, the variable would need to be continuous and not discrete.

1.11

(a)

(b)

1.12

(a) (b) (c)

The population is “members of the retailer’s rewards program from the metropolitan area.” A systematic or random sample could be taken of members from the rewards program from the metropolitan area. The director might wish to collect both numerical and categorical data. Three categorical questions might be occupation, marital status, type of clothing. Numerical questions might be age, average monthly hours shopping for clothing, income.

0001 0040 0902


1.13

(a) Sample without replacement: Start at row 29. Read from left to right in 3-digit sequences and continue unfinished sequences from end of row to beginning of next row.

Row 29: 124 783 762 299 659 310 658 361 369 889 588 692 957
Rows 29-30: 157
Row 30: 175 555 646 541 142 547 704 570 342 672 937 837
Rows 30-31: 929
Row 31: 161 611 075 801 030 783 159 309 132 762 671 073 000
Row 32: 780 257 353 914 621 390 444 745 003 197 127 874 770
Rows 32-33: 927
Row 33: 587 672 288 014 510 175 128 228 668 765 530 493
Rows 33-34: 251
Row 34: 669 020 427 042 516 447 773 709 739 459 239 668 263
Row 35: 701 835 806 565 489 318 338 209 316 747 103 865 929
Rows 35-36: 390
Row 36: 730 353 851 567 999 742 508 667 802 875 573 672
Rows 36-37: 571
Row 37: 093 493 242 134 312 459 002 770 485 820 090 658 595
Row 38: 824 623 016

Note: All sequences above 127 and all repeating sequences are discarded.

(b) Use the same technique as in part (a). Note: All sequences above 127 are discarded. There were no repeating sequences.
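The discard rule used with the random number table in 1.13 can be sketched in Python. This is only an illustration: the function name `select_from_table` is hypothetical, and the sketch assumes the frame is numbered 001 through 127, so that the sequence 000 is also unusable. The input is the digit sequences of rows 31 and 32 as listed in the answer.

```python
def select_from_table(sequences, low, high, replacement=False):
    """Return the usable item numbers, in table order.

    sequences: digit strings read from the random number table.
    low, high: assumed first and last item numbers in the frame.
    replacement: if False, a number that repeats is discarded.
    """
    selected = []
    seen = set()
    for seq in sequences:
        value = int(seq)
        if value < low or value > high:
            continue  # outside the frame, discard
        if not replacement and value in seen:
            continue  # repeat, discard when sampling without replacement
        seen.add(value)
        selected.append(value)
    return selected


# Rows 31 and 32 as listed in the answer to 1.13.
rows_31_32 = (
    "161 611 075 801 030 783 159 309 132 762 671 073 000 "
    "780 257 353 914 621 390 444 745 003 197 127 874 770"
).split()

picked = select_from_table(rows_31_32, low=1, high=127)
print(picked)  # [75, 30, 73, 3, 127]
```

Passing `replacement=True` keeps repeated numbers, which is the only difference between the two sampling schemes in this sketch.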

1.14

A simple random sample would be less practical for personal interviews because of travel costs, unless interviewees are paid to attend a central interviewing location.

1.15

This is a probability sample because the selection is based on chance. It is not a simple random sample because A is more likely to be selected than B or C.

1.16

Here all members of the population are equally likely to be selected and the sample selection mechanism is based on chance. But selection of two elements is not independent; for example if A is in the sample, we know that B is also, and that C and D are not.

1.17

(a) Since a complete roster of registered students exists, a simple random sample of 200 students could be taken. If student satisfaction with the quality of campus life randomly fluctuates across the student body, a systematic 1-in-20 sample could also be taken from the population frame. If student satisfaction with the quality of life may differ by status and by experience/class level, a stratified sample using eight strata, full-time freshmen through full-time seniors and part-time freshmen through part-time seniors, could be selected. If student satisfaction with the quality of life is thought to fluctuate as much within clusters as between them, a cluster sample could be taken.

(b) A simple random sample is one of the simplest to select. The population frame is the registrar’s file of 3,000 student names.

(c) A systematic sample is easier to select by hand from the registrar’s records than a simple random sample, since an initial person at random is selected and then every 20th person thereafter would be sampled. The systematic sample would have the additional benefit that the alphabetic distribution of sampled students’ names would be more comparable to the alphabetic distribution of student names in the campus population.

(d) If rosters by status and class designations are readily available, a stratified sample should be taken. Since student satisfaction with the quality of life may indeed differ by status and class level, the use of a stratified sampling design will not only ensure all strata are represented in the sample, it will also generate a more representative sample and produce estimates of the population parameter that have greater precision.

(e) If all 3,000 registered students reside in one of 10 on-campus residence halls which fully integrate students by status and by class, a cluster sample should be taken. A cluster could be defined as an entire study house, and the students of a single randomly selected study house could be sampled. Since each study house has 300 students, a systematic sample of 150 students can then be selected from the chosen cluster of 300 students. Alternately, a cluster could be defined as a floor of one of the 10 study houses. Suppose there are six floors in each dormitory with 50 students on each floor. Three floors could be randomly sampled to produce the required 150 student sample. Selection of an entire study house may make distribution and collection of the survey easier to accomplish. In contrast, if there is some variable other than status or class that differs across study houses, sampling by floor may produce a more representative sample.

1.18

(a)
Row 16: 2323 6737 5131 8888 1718 0654 6832 4647 6510 4877
Row 17: 4579 4269 2615 1308 2455 7830 5550 5852 5514 7182
Row 18: 0989 3205 0514 2256 8514 4642 7567 8896 2977 8822
Row 19: 5438 2745 9891 4991 4523 6847 9276 8646 1628 3554
Row 20: 9475 0899 2337 0892 0048 8033 6945 9826 9403 6858
Row 21: 7029 7341 3553 1403 3340 4205 0823 4144 1048 2949
Row 22: 8515 7479 5432 9792 6575 5760 0408 8112 2507 3742
Row 23: 1110 0023 4012 8607 4697 9664 4894 3928 7072 5815
Row 24: 3687 1507 7530 5925 7143 1738 1688 5625 8533 5041
Row 25: 2391 3483 5763 3081 6090 5169 0546

Note: All sequences above 5000 are discarded. There were no repeating sequences.

(b) 0089 0189 0289 0389 0489 0589 0689 0789 0889 0989 1089 1189 1289 1389 1489 1589 1689 1789 1889 1989 2089 2189 2289 2389 2489 2589 2689 2789 2889 2989 3089 3189 3289 3389 3489 3589 3689 3789 3889 3989 4089 4189 4289 4389 4489 4589 4689 4789 4889 4989

(c) With the single exception of invoice 0989, the invoices selected in the simple random sample are not the same as those selected in the systematic sample. It would be highly unlikely that a random process would select the same units as a systematic process.

1.19

(a) A stratified sample should be taken so that each of the three strata will be proportionately represented.

(b) The number of observations in each of the three strata out of the total of 100 should reflect the proportion of the three categories in the customer database. For example, 500/1000 = 50% so 50% of 100 = 50 customers should be selected from the potential customers; similarly, 300/1000 = 30% so 30 customers should be selected from those who have purchased once, and 200/1000 = 20% so 20 customers from the repeat buyers.

(c) It is not simple random sampling because, unlike simple random sampling, it ensures proportionate representation across the entire population.

1.20

Before accepting the results of a survey of college students, you might want to know, for example: Who funded the survey? Why was it conducted? What was the population from which the sample was selected? What sampling design was used? What mode of response was used: a personal interview, a telephone interview, or a mail survey? Were interviewers trained? Were survey questions field-tested? What questions were asked? Were the questions clear, accurate, unbiased, valid? What operational definition of immediately and effortlessly was used? What was the response rate?
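The systematic selection in 1.18 (b) and the proportional allocation in 1.19 can be sketched in Python. This is a minimal illustration, not the manual's procedure. The function names are hypothetical. The frame size of 5,000 invoices and the sample size of 50 are inferred from the listed invoice numbers, and the starting invoice 0089 is taken from the answer rather than drawn at random.

```python
def systematic_sample(frame_size, sample_size, start):
    """Select every k-th item number beginning at `start`, with k = frame_size // sample_size."""
    k = frame_size // sample_size
    return [start + i * k for i in range(sample_size)]


def proportional_allocation(strata_sizes, sample_size):
    """Allocate the sample to strata in proportion to each stratum's share of the frame."""
    total = sum(strata_sizes.values())
    return {name: round(sample_size * size / total)
            for name, size in strata_sizes.items()}


# 1.18 (b): assumed frame of 5,000 invoices, 50 selections, first pick 0089.
invoices = systematic_sample(frame_size=5000, sample_size=50, start=89)
print(invoices[:3], invoices[-1])  # [89, 189, 289] 4989

# 1.19: strata of 500, 300, and 200 customers, total sample of 100.
allocation = proportional_allocation(
    {"potential": 500, "purchased once": 300, "repeat": 200}, 100)
print(allocation)  # {'potential': 50, 'purchased once': 30, 'repeat': 20}
```

With strata sizes that do not divide evenly, the rounded allocations may not add up exactly to the requested sample size, so a real design would need a rule for adjusting them.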


1.21

(a) Possible coverage error: Only employees in a specific division of the company were sampled.

(b) Possible nonresponse error: No attempt is made to contact nonrespondents to urge them to complete the evaluation of job satisfaction.

(c) Possible sampling error: The sample statistics obtained from the sample will not be equal to the parameters of interest in the population.

(d) Possible measurement error: Ambiguous wording in questions asked on the questionnaire.

1.22

The results are based on a survey of bank executives. If the frame is supposed to be banking institutions, how is the population defined? There is no information about the response rate, so there is an undefined nonresponse error.

1.23

Before accepting the results of the survey, you might want to know, for example: Who funded the survey? Why was it conducted? What was the population from which the sample was selected? What sampling design was used? What mode of response was used: a personal interview, a telephone interview, or a mail survey? Were interviewers trained? Were survey questions field-tested? What questions were asked? Were they clear, accurate, unbiased, valid? What was the response rate? What was the margin of error? What was the sample size? What frame was used?

1.24

Before accepting the results of the survey, you might want to know, for example: Who funded the survey? Why was it conducted? What was the population from which the sample was selected? What sampling design was used? What mode of response was used: a personal interview, a telephone interview, or a mail survey? Were interviewers trained? Were survey questions field-tested? What questions were asked? Were the questions clear, accurate, unbiased, valid? What was the response rate? What was the margin of error? What was the sample size? What frame was used?

1.25

Only the second value, 2.7GB, contains units.

1.26

(a) Invalid values include Appel, Samsun, APPLE, Apple iPhone, and mOTOROLA.

(b) Appel should be Apple. Samsun should be Samsung. APPLE should be Apple. Apple iPhone should be Apple. mOTOROLA should be Motorola.
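The corrections listed in 1.26 amount to a lookup followed by a domain check, which can be sketched in Python. The names `CORRECTIONS`, `VALID_BRANDS`, and `clean_brand` are hypothetical. The domain is assumed to hold only the three brands that appear in the corrected values, although the actual data set may allow others.

```python
# Mapping from each invalid value named in 1.26 to its corrected value.
CORRECTIONS = {
    "Appel": "Apple",
    "Samsun": "Samsung",
    "APPLE": "Apple",
    "Apple iPhone": "Apple",
    "mOTOROLA": "Motorola",
}

# Assumed domain: only the brands that appear in the corrections above.
VALID_BRANDS = {"Apple", "Samsung", "Motorola"}


def clean_brand(value):
    """Return (cleaned_value, in_domain) for one raw brand entry."""
    cleaned = CORRECTIONS.get(value.strip(), value.strip())
    return cleaned, cleaned in VALID_BRANDS


raw = ["Appel", "Samsung", "APPLE", "Apple iPhone", "mOTOROLA", "Nokai"]
for entry in raw:
    print(entry, "->", clean_brand(entry))
```

An entry that fails the domain check after correction, such as the made-up value "Nokai", is flagged rather than silently changed.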

1.27

(a) Employee ID has a data integration error. Payment Method needs a domain check. Age needs missing values marked, a domain check, and an outlier check. Division needs a domain check and a check for data integration errors. Department has data integration errors and needs a domain check.

(b) In Employee ID, the first and eleventh values are identical, and the sixteenth value “EPM16” should be “EMP16”. In Payment Method, the values “Male”, “F”, and “EMP12” should be corrected and the missing ninth value identified. In Age, “k20” should be “20”, “221” should be corrected, and the missing seventeenth value identified. In Division, “EAST” should be “East”, “N” should be “North”, and “Nroth” should be “North”. In Department, “Customer Rel.” should be “Customer Relations”, “Brand Mgt.” should be “Brand Mgt”, “Operati0ns” should be “Operations”, “Human capital” should be “Human Capital”, and “FIN” should be “Finance”.


1.28

(a) Fund Number contains the wrong data. Category needs a domain check. 5-Yr Return should be formatted as a percentage. 10-Yr Return needs missing values marked. Net Expense Ratio needs a domain check and an outlier check. Rating needs a domain check and an outlier check. Assets needs a domain check and missing values marked.

(b) The last three values in Net Expense Ratio; the eleventh value in Assets.

(c) For both Net Expense Ratio and Assets, a maximum value could be defined.

1.29

Data cleaning improves the quality of the data while data wrangling changes the organization of the data.

1.30

No, because the categories do not seem to be mutually exclusive. The categories need specific ranges, such as younger than 21, 21 to 34, 35 to 54, and 55 or older.

1.31

Stack the data by Housekeeping Requested values.

1.32

(a) The times for each of the hotels would be arranged in separate columns.

(b) The hotel names would be in one column and the times would be in a second column.

1.33

For data cleaning, 10-year return and Rating should mark missing values. For data wrangling, the data could be stacked by market cap, fund type, YTD return, 1-year return, 10-year return, life of fund, expense ratio, or fund rating.

1.34

A population contains all the items of interest whereas a sample contains only a portion of the items in the population.

1.35

A statistic is a summary measure describing a sample whereas a parameter is a summary measure describing an entire population.

1.36

Categorical random variables yield categorical responses such as yes or no answers. Numerical random variables yield numerical responses such as your height in inches.

1.37

Discrete random variables produce numerical responses that arise from a counting process. Continuous random variables produce numerical responses that arise from a measuring process.

1.38

Both nominal and ordinal variables are categorical variables, but no ranking is implied in a nominal variable such as male or female, while ranking is implied in an ordinal variable such as a student’s grade of A, B, C, D, and F.

1.39

Both interval and ratio variables are numerical variables in which the difference between measurements is meaningful, but an interval variable, such as standardized exam scores, does not involve a true zero, while a ratio variable, such as height, does.

1.40

A list of values defines a domain for a categorical variable, whereas a range defines a domain for a numerical variable.


1.41

Items or individuals in a probability sample are selected based on known probabilities, while items or individuals in a nonprobability sample are selected without knowing their probabilities of selection.

1.42

Missing values are values that were not collected for a variable. Outliers are values that seem excessively different from most of the other values.

1.43

In unstacked arrangements, separate numerical variables are created for each group in the data. For example, you might create a variable for the weights of men and a second variable for the weights of women. In stacked arrangements, a single numerical variable is paired with a categorical variable that represents the categories. For example, all weights would be in one variable, with a categorical variable indicating male or female.
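The two arrangements described in 1.43 can be converted into one another, as the following Python sketch shows. The weights are made-up illustrative values, and `stack` and `unstack` are hypothetical helper names.

```python
# Unstacked: one numerical variable per group (illustrative weights, not real data).
unstacked = {"male": [180, 175, 190], "female": [140, 150]}


def stack(groups):
    """Pair every value with its group label: one numerical and one categorical variable."""
    return [(label, value) for label, values in groups.items() for value in values]


def unstack(rows):
    """Rebuild one list of values per group from (label, value) pairs."""
    groups = {}
    for label, value in rows:
        groups.setdefault(label, []).append(value)
    return groups


stacked = stack(unstacked)
print(stacked)
# [('male', 180), ('male', 175), ('male', 190), ('female', 140), ('female', 150)]
print(unstack(stacked) == unstacked)  # True
```

The stacked form keeps all weights in a single variable, which is the layout most statistical routines expect when comparing groups.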

1.44

Coverage error is error generated due to an improperly or inappropriately framed population which can result in a sample that may not be representative of the population that one wishes to study. Non-response error is error generated due to members of a chosen sample not being contacted even after repeated attempts so that information that should be provided is missing.

1.45

Sampling error results from the variability of outcomes of different samples. This sample to sample variation is inevitably connected to the sampling process. Measurement error is error that results from either self-reported data or data that is collected in an inconsistent manner by those who are responsible for collecting and summarizing the desired information.

1.46

Microsoft Excel: This product features a spreadsheet-based interface that allows users to organize, calculate, and analyze data. Excel also contains many statistical functions to assist in the description of a dataset. Excel can be used to develop worksheets and workbooks to calculate a variety of statistics, from introductory to advanced. Excel also includes interactive tools to create graphs, charts, and pivot tables. Excel can be used to summarize data to better understand a population of interest, compare across groups, predict outcomes, and develop forecasting models. These capabilities represent those that are generally relevant to the current course. Excel also includes many other statistical capabilities that can be further explored on the Microsoft Office Excel official website.

1.47

(a) The population of interest includes banking executives representing institutions of various sizes and U.S. geographic locations.

(b) The collected sample includes 163 banking executives from institutions of various sizes and U.S. geographic locations.

(c) A parameter of interest is the percentage of the population of banking executives that identify customer experience initiatives as an area where increased spending is expected.

(d) A statistic used to estimate the parameter in (c) is the percentage of the 163 banking executives included in the sample who identify customer experience initiatives as an area where increased spending is expected. In this case, the statistic is 55%.

1.48

The answers are based on an article titled “U.S. Satisfaction Still Running at Improved Level” and written by Lydia Saad (August 15, 2018). The article is located on the following site: https://news.gallup.com/poll/240911/satisfaction-running-improvedlevel.aspx?g_source=link_NEWSV9&g_medium=NEWSFEED&g_campaign=item_&g_content=U.S.%2520Satisfaction%2520Still%2520Running%2520at%2520Improved%2520Level

(a) The population of interest includes all individuals aged 18 and older who live within the 50 U.S. states and the District of Columbia.

(b) The collected sample includes a random sample of 1,024 individuals aged 18 and older who live within the 50 U.S. states and the District of Columbia.

(c) A parameter of interest is the percentage of the population of individuals aged 18 and older who live within the 50 U.S. states and the District of Columbia and are satisfied with the direction of the U.S.

(d) A statistic used to estimate the parameter in (c) is the percentage of the 1,024 individuals included in the sample. In this case, the statistic is 36%.

1.49

The answers were based on information obtained from the following site:

(a) The population of interest is U.S. CEOs.

(b) The sample included 1,000 U.S. CEOs.

(c) A parameter of interest would be the percentage of CEOs among the population of interest that believe that AI will significantly change the way they will do business in the next five years.

(d) The statistic used to estimate the parameter in (c) is the percentage of CEOs among the 1,000 CEOs included in the sample who believe that AI will significantly change the way they will do business in the next five years. In this case, the statistic is that 80% agree with this statement.

1.50

(a) One variable collected with the American Community Survey is marital status with the following possible responses: now married, widowed, divorced, separated, and never married.

(b) The variable in (a) represents a categorical variable.

(c) Because the variable in (a) is categorical, this question is not applicable. If one had chosen age in years from the American Community Survey as the variable, the answer to (c) would be discrete.

1.51

Answers will vary depending on the specific sample survey used. The answers below were based on the sample survey located at: bit.ly/21qjI6F

(a) An example of a categorical variable included in the survey is gender with male or female as possible answers.

(b) An example of a numerical variable included in the survey would be the number of phone calls made to or received from one’s direct supervisor in an average week.

1.52

(a) The population of interest consisted of 10,000 benefited employees of the University of Utah.

(b) The sample consisted of 3,095 employees of the University of Utah.

(c) Gender, marital status, and employment category represent categorical variables. Age in years, education level in years completed, and household income represent numerical variables.

1.53

(a) Key social media platforms used represents a categorical variable. The frequency of social media usage represents a discrete numerical variable. Demographics of key social media platform users represent categorical variables.

(b)
1. Which of the following is your preferred social media platform: YouTube, Facebook, or Twitter?
2. What time of the day do you spend the most amount of time using social media: morning, afternoon, or evening?
3. Please indicate your ethnicity.
4. Which of the following do you most often use to access social media: mobile device, laptop computer, desktop computer, other device?
5. Please indicate whether you are a home owner: Yes or No?

(c)
1. For the past week, how many hours did you spend using social media?
2. Please indicate your current age in years.
3. What was your annual income this past year?
4. Currently, how many friends have you accepted on Facebook?
5. Currently, how many Twitter followers do you have?

Chapter 2

2.1

(a)
Category    Frequency    Percentage
A           13           26%
B           28           56%
C           9            18%

(b) Category “B” is the majority.

2.2

(a) Table frequencies for all student responses

            Student Major Categories
Status      A         B         M         Totals
F/T         14        9         2         25
P/T         6         6         3         15
Totals      20        15        5         40

(b) Table percentages based on overall student responses

            Student Major Categories
Status      A         B         M         Totals
F/T         35.0%     22.5%     5.0%      62.5%
P/T         15.0%     15.0%     7.5%      37.5%
Totals      50.0%     37.5%     12.5%     100.0%

Table based on row percentages

            Student Major Categories
Status      A         B         M         Totals
F/T         56.0%     36.0%     8.0%      100.0%
P/T         40.0%     40.0%     20.0%     100.0%
Totals      50.0%     37.5%     12.5%     100.0%

Table based on column percentages

            Student Major Categories
Status      A         B         M         Totals
F/T         70.0%     60.0%     40.0%     62.5%
P/T         30.0%     40.0%     60.0%     37.5%
Totals      100.0%    100.0%    100.0%    100.0%

2.3

(a) You can conclude that Apple, Samsung, and Others dominated the market from the third quarter of 2020 through the third quarter of 2021. Others had the largest market share in Q3 2020, but decreased from 38% to 31% in the third quarter of 2021. Samsung also decreased from 22% in Q3 2020 to 18% in Q3 2021. However, Apple gained market share, increasing from 11% in the third quarter of 2020 to 15% in the third quarter of 2021.

(b) Apple, OPPO, vivo, and Xiaomi increased market share while Samsung and Others decreased market share.


2.4

(a)
Category                          Total        Percentage
Credit reporting, credit repair   685,189      63.35%
Debt collection                   163,512      15.12%
Credit card or prepaid card       88,175       8.15%
Checking or savings account       72,555       6.71%
Mortgage                          72,241       6.68%
Total                             1,081,672

(b) There are more complaints for credit reporting, debt collection, and credit card or prepaid card than the other categories. These categories account for about 87% of all the complaints.

(c)
Category                                     Total     Percentage
General purpose credit card or charge card   5,836     78.36%
General purpose prepaid card                 240       3.22%
Gift card                                    24        0.32%
Government benefit card                      253       3.40%
Payroll card                                 26        0.35%
Store credit card                            1,065     14.30%
Student prepaid card                         4         0.05%
Total                                        7,448

(d) The bulk of the complaints were for general purpose credit card or charge card, and for store credit card.
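The percentages in a summary table such as the one in 2.4 (c) are each category's count divided by the total. A minimal Python sketch follows, using the counts reported in part (c); the function name `summary_table` is hypothetical.

```python
def summary_table(counts):
    """Return (percentages, total), with each percentage rounded to two decimals."""
    total = sum(counts.values())
    percentages = {category: round(100 * n / total, 2)
                   for category, n in counts.items()}
    return percentages, total


# Counts from the answer to 2.4 (c).
card_complaints = {
    "General purpose credit card or charge card": 5836,
    "General purpose prepaid card": 240,
    "Gift card": 24,
    "Government benefit card": 253,
    "Payroll card": 26,
    "Store credit card": 1065,
    "Student prepaid card": 4,
}

percentages, total = summary_table(card_complaints)
print(total)  # 7448
for category, pct in percentages.items():
    print(f"{category}: {pct:.2f}%")
```

Because each percentage is rounded separately, the rounded values need not sum to exactly 100%.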

2.5

The respondents from the sample indicated that about half expect increases in budget for the upcoming year in all three categories: data visualization tools, 48%; advanced analytics, 53%; and applied AI solutions, 52%. The respondents indicated little support for decrease or no investment, with responses to each of those categories at 6% or less.

2.6

The largest sources of electricity in the United States are natural gas followed equally by coal, nuclear, and renewables.



2.7

(a)
Cloud Value Measure                                                    Frequency    Percentage
Cost savings                                                           74           14.12%
Increased revenue                                                      10           1.91%
Faster innovation and delivery of new digital products and services   127          24.24%
Ability to execute on strategy to fundamentally change the business   90           17.18%
Improved operation resilience, safety, and soundness                   105          20.04%
We are not specifically measuring value because cloud is seen as
a necessary business foundation                                        21           4.01%
Unsure                                                                 6            1.15%
Total                                                                  524

(b) Executives expect the greatest value of the cloud to be in faster innovation and delivery of new digital products and services, followed by improved operation resilience, safety, and soundness. Executives do not expect to find cloud value in increased revenue; the responses “we are not specifically measuring value because cloud is seen as a necessary business foundation” and “unsure” were also rarely chosen.

2.8

(a) Table of row percentages:

                    Student Status
Pizza Preference    Full-time    Part-time    Total
Local               48.97%       51.03%       100.00%
National chain      74.67%       25.33%       100.00%
Total               57.73%       42.27%       100.00%

Table of column percentages:

                    Student Status
Pizza Preference    Full-time    Part-time    Total
Local               55.91%       79.57%       65.91%
National chain      44.09%       20.43%       34.09%
Total               100.00%      100.00%      100.00%

Table of total percentages:

                    Student Status
Pizza Preference    Full-time    Part-time    Total
Local               32.27%       33.64%       65.91%
National chain      25.45%       8.64%        34.09%
Total               57.73%       42.27%       100.00%

(b) The full-time students were more likely to prefer the local pizza. The part-time students were much more likely to prefer the local pizza.


2.9

(a)

Table of row percentages:

                    Location
Churned     Ashland   Springville     Total
Yes          49.72%        50.28%   100.00%
No           50.60%        49.40%   100.00%
Total        50.32%        49.68%   100.00%

Table of column percentages:

                    Location
Churned     Ashland   Springville     Total
Yes          31.45%        32.21%    31.83%
No           68.55%        67.79%    68.17%
Total       100.00%       100.00%   100.00%

Table of total percentages:

                    Location
Churned     Ashland   Springville     Total
Yes          15.82%        16.00%    31.83%
No           34.49%        33.68%    68.17%
Total        50.32%        49.68%   100.00%

(b)

The customers in Ashland are not likely to churn (31.45% churned). The customers in Springville are not likely to churn either (32.21% churned).



2.10

(a)

Summary of results:

               Paperless Billing
Churned        Yes        No     Total
Yes          1,358       398     1,756
No           2,367     1,394     3,761
Total        3,725     1,792     5,517

Table of row percentages:

               Paperless Billing
Churned        Yes        No     Total
Yes         77.33%    22.67%   100.00%
No          62.94%    37.06%   100.00%
Total       67.52%    32.48%   100.00%

Table of column percentages:

               Paperless Billing
Churned        Yes        No     Total
Yes         36.46%    22.21%    31.83%
No          63.54%    77.79%    68.17%
Total      100.00%   100.00%   100.00%

Table of total percentages:

               Paperless Billing
Churned        Yes        No     Total
Yes         24.61%     7.21%    31.83%
No          42.90%    25.27%    68.17%
Total       67.52%    32.48%   100.00%

(b)

The customers who had paperless billing were more likely to churn.
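As a minimal sketch of how the row, column, and total percentage tables in Problem 2.10 follow from the summary counts, the Python below recomputes all three. The function name `percentage_tables` and the nested-dictionary layout are illustrative choices, not part of the original solution.

```python
# Counts from the Problem 2.10 summary table.
# Outer keys: churned (rows); inner keys: paperless billing (columns).
counts = {
    "Yes": {"Yes": 1358, "No": 398},
    "No": {"Yes": 2367, "No": 1394},
}


def percentage_tables(counts):
    """Return (row %, column %, total %) tables for a two-way table of counts."""
    row_totals = {r: sum(cols.values()) for r, cols in counts.items()}
    col_names = list(next(iter(counts.values())))
    col_totals = {c: sum(counts[r][c] for r in counts) for c in col_names}
    grand_total = sum(row_totals.values())

    def pct(part, whole):
        return round(100 * part / whole, 2)

    row_pct = {r: {c: pct(n, row_totals[r]) for c, n in cols.items()}
               for r, cols in counts.items()}
    col_pct = {r: {c: pct(n, col_totals[c]) for c, n in cols.items()}
               for r, cols in counts.items()}
    total_pct = {r: {c: pct(n, grand_total) for c, n in cols.items()}
                 for r, cols in counts.items()}
    return row_pct, col_pct, total_pct


row_pct, col_pct, total_pct = percentage_tables(counts)
# Churn rate among paperless-billing customers versus the rest.
print(col_pct["Yes"])
```

The column percentages are the ones that support the conclusion in (b): they compare the churn rate within each billing group.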

2.11

Ordered array: 64 68 71 75 81 88 94

2.12

Ordered array: 73 78 78 78 85 88 91

2.13

(a) (166 + 100)/591 * 100 = 45.01%

(b) (124 + 77)/591 * 100 = 34.01%

(c) (59 + 65)/591 * 100 = 20.98%

(d) 45% of the incidents took fewer than 2 days and 66% of the incidents were detected in less than 8 days. 79% of the incidents were detected in less than 31 days.

2.14

(261,000 - 61,000)/6 = 33,333.33, so choose 40,000 as the interval width.

(a) $60,000 – under $100,000; $100,000 – under $140,000; $140,000 – under $180,000; $180,000 – under $220,000; $220,000 – under $260,000; $260,000 – under $300,000

(b) $40,000

(c) (60,000 + 100,000)/2 = $80,000; similarly, the remaining class midpoints are $120,000; $160,000; $200,000; $240,000; $280,000

2.15

(a) Franchise valuations ordered array:
0.990 1.100 1.110 1.180 1.190 1.280 1.300 1.320 1.375 1.380
1.385 1.390 1.400 1.575 1.700 1.760 1.780 1.980 2.000 2.050
2.100 2.200 2.300 2.450 2.650 3.500 3.800 3.900 4.075 6.000

(b) The valuations range from 0.990 to 6.000.

Franchise Valuations         Frequency   Percentage
0.990 but less than 1.700           15        50.0%
1.700 but less than 2.400            8        26.7%
2.400 but less than 3.100            2         6.7%
3.100 but less than 3.800            2         6.7%
3.800 but less than 4.500            2         6.7%
4.500 but less than 5.200            0           0%
5.200 to 6.000                       1         3.3%

Half of the valuations are less than 1.700.

(c) Payrolls ordered array:
48.06 58.08 60.39 61.66 79.78 83.92 93.25 94.93 104.64 113.02
129.54 132.19 133.14 135.29 136.91 146.20 151.48 157.87 158.20 170.74
174.36 181.89 184.63 190.37 211.87 211.97 224.45 236.84 266.00 284.73

(d) The payrolls range from 48.06 to 284.73.

Payroll                        Frequency   Percentage
48.06 but less than 82.06              5        16.7%
82.06 but less than 116.06             5        16.7%
116.06 but less than 150.06            6        20.0%
150.06 but less than 184.06            6        20.0%
184.06 but less than 218.06            4        13.3%
218.06 but less than 252.06            2         6.7%
252.06 to 284.73                       2         6.7%

Payroll seems centered around 150.06.

2.16

(a)
Total Housing Cost          Frequency   Percentage
$250 but less than $300             4        7.84%
$300 but less than $350            17       33.33%
$350 but less than $400            15       29.41%
$400 but less than $450            12       23.53%
$450 but less than $500             1        1.96%
$500 but less than $550             2        3.92%

(b)
Total Housing Cost   Frequency   Percentage   Cumulative %
$250                         0        0.00%          0.00%
$300                         4        7.84%          7.84%
$350                        17       33.33%         41.17%
$400                        15       29.41%         70.59%
$450                        12       23.53%         94.12%
$500                         1        1.96%         96.08%
$550                         2        3.92%        100.00%

(c) The apartment costs are clustered between $300 and $450.

2.17

(a)
Commuting Time         Frequency   Percentage
19 but less than 22           15          15%
22 but less than 25           38          38%
25 but less than 28           27          27%
28 but less than 31           13          13%
31 but less than 34            5           5%
34 but less than 37            2           2%

(b)
Commuting Time         Frequency   Percentage   Cumulative %
19 but less than 22           15          15%            15%
22 but less than 25           38          38%            53%
25 but less than 28           27          27%            80%
28 but less than 31           13          13%            93%
31 but less than 34            5           5%            98%
34 but less than 37            2           2%           100%

(c)

More than half of commuters spend from 19 up to 25 minutes commuting each week. 93% of commuters spend from 19 up to 31 minutes commuting each week.
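The cumulative percentages quoted for Problem 2.17 are running totals of the class frequencies divided by the sample size. A minimal sketch follows; the helper name `cumulative_percentages` is illustrative only.

```python
from itertools import accumulate

# Class frequencies from the Problem 2.17 commuting-time distribution.
classes = [
    "19 but less than 22", "22 but less than 25", "25 but less than 28",
    "28 but less than 31", "31 but less than 34", "34 but less than 37",
]
frequencies = [15, 38, 27, 13, 5, 2]


def cumulative_percentages(frequencies):
    """Running total of the frequencies, expressed as a percentage of n."""
    total = sum(frequencies)
    return [round(100 * running / total, 2) for running in accumulate(frequencies)]


for label, cum in zip(classes, cumulative_percentages(frequencies)):
    print(f"{label:<22}{cum:>7.2f}%")
```

Reading the second and fourth rows of the output gives the 53% (up to 25 minutes) and 93% (up to 31 minutes) figures used in the interpretation.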



2.18

(a), (b)
Credit Score       Frequency   Percent (%)   Cumulative Percent (%)
670 – under 680            1          1.96                     1.96
680 – under 690            4          7.84                     9.80
690 – under 700            8         15.69                    25.49
700 – under 710            5          9.80                    35.29
710 – under 720           12         23.53                    58.82
720 – under 730           14         27.45                    86.27
730 – under 740            7         13.73                   100.00

(c) The average credit scores are concentrated between 710 and 730.

2.19

(a), (b)
Bin                                 Frequency   Percentage   Cumulative %
–0.00350 but less than –0.00201            13       13.00%         13.00%
–0.00200 but less than –0.00051            26       26.00%         39.00%
–0.00050 but less than 0.00099             32       32.00%         71.00%
0.00100 but less than 0.00249              20       20.00%         91.00%
0.00250 but less than 0.00399               8        8.00%         99.00%
0.00400 but less than 0.00549               1        1.00%        100.00%

(c) Yes, the steel mill is doing a good job at meeting the requirement as there is only one steel part out of a sample of 100 that is as much as 0.005 inches longer than the specified requirement.

2.20

(a), (b)
Time in Seconds    Frequency   Percent (%)
5 – under 10               8           16%
10 – under 15             15           30%
15 – under 20             18           36%
20 – under 25              6           12%
25 – under 30              3            6%

(b)
Time in Seconds    Percentage Less Than
5                                     0
10                                   16
15                                   46
20                                   82
25                                   94
30                                  100

(c)

The target is being met since 82% of the calls are being answered in less than 20 seconds



2.21

(a)
Call Duration (seconds)   Frequency   Percentage
60 up to 119                      7          14%
120 up to 179                    12          24%
180 up to 239                    11          22%
240 up to 299                    11          22%
300 up to 359                     4           8%
360 up to 419                     3           6%
420 and longer                    2           4%
Total                            50         100%

(b)
Call Duration (seconds)   Frequency   Percentage   Cumulative %
60 up to 119                      7          14%            14%
120 up to 179                    12          24%            38%
180 up to 239                    11          22%            60%
240 up to 299                    11          22%            82%
300 up to 359                     4           8%            90%
360 up to 419                     3           6%            96%
420 and longer                    2           4%           100%
Total                            50         100%

(c) The call center’s target of call duration less than 240 seconds is only met for 60% of the calls in this data set.

2.22

(a)

(b)

(c) Manufacturer B produces bulbs with longer lives than Manufacturer A. The cumulative percentage for Manufacturer B shows 65% of its bulbs lasted less than 50,500 hours, contrasted with 70% of Manufacturer A’s bulbs, which lasted less than 49,500 hours. None of Manufacturer A’s bulbs lasted more than 51,500 hours, but 12.5% of Manufacturer B’s bulbs lasted between 51,500 and 52,500 hours. At the same time, 7.5% of Manufacturer A’s bulbs lasted less than 47,500 hours, whereas all of Manufacturer B’s bulbs lasted at least 47,500 hours.

2.23

(a)
Amount of Soft Drink   Frequency   Percentage
1.850 – 1.899                  1           2%
1.900 – 1.949                  5          10%
1.950 – 1.999                 18          36%
2.000 – 2.049                 19          38%
2.050 – 2.099                  6          12%
2.100 – 2.149                  1           2%

(b)
Amount of Soft Drink   Frequency Less Than   Percentage Less Than
1.899                                    1                     2%
1.949                                    6                    12%
1.999                                   24                    48%
2.049                                   43                    86%
2.099                                   49                    98%
2.149                                   50                   100%

The amount of soft drink filled in the two-liter bottles is most concentrated in two intervals on either side of the two-liter mark, from 1.950 to 1.999 and from 2.000 to 2.049 liters. Almost three-fourths of the 50 bottles sampled contained between 1.950 liters and 2.049 liters.

2.24

(a) Bar chart, pie chart, and Pareto chart of Average per Month by Purchase Method, based on these values:

Purchase Method     Average per Month
Mail order                        0.1
Prepaid                           0.8
Online bill pay                   1.9
Check                             2.3
Bank account                      2.3
Cash                              6.5
Credit                            9.4
Debit                             9.8

(b) (c)

The Pareto chart is best for portraying these data because it not only sorts the frequencies in descending order but also provides the cumulative line on the same chart. You can conclude that debit, credit, and cash are the “vital few” transaction methods.
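To make the “vital few” conclusion concrete, the sketch below builds the table behind a Pareto chart for the Problem 2.24 values: categories sorted in descending order with each category's percentage share and the running cumulative percentage. The function name `pareto_table` is an illustrative assumption, and the percentages are computed here rather than taken from the original chart.

```python
# Average per month by purchase method (values from the Problem 2.24 charts).
average_per_month = {
    "Mail order": 0.1, "Prepaid": 0.8, "Online bill pay": 1.9, "Check": 2.3,
    "Bank account": 2.3, "Cash": 6.5, "Credit": 9.4, "Debit": 9.8,
}


def pareto_table(values):
    """Return (category, value, % of total, cumulative %) rows, largest first."""
    total = sum(values.values())
    ordered = sorted(values.items(), key=lambda item: item[1], reverse=True)
    rows, running = [], 0.0
    for name, value in ordered:
        running += value
        rows.append((name, value,
                     round(100 * value / total, 1),
                     round(100 * running / total, 1)))
    return rows


rows = pareto_table(average_per_month)
for name, value, share, cumulative in rows:
    print(f"{name:<16}{value:>5.1f}{share:>7.1f}%{cumulative:>7.1f}%")
```

In this table the cumulative column after the third row shows that debit, credit, and cash together account for roughly three-quarters of the total.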



2.25

(a) Bar chart, pie chart, and Pareto chart of Hours Spent by Activity, based on these values:

Activity                           Hours Spent
Grooming                                   0.8
Eating and drinking                        1.0
Traveling                                  1.4
Others                                     2.2
Working and related activities             2.3
Educational activities                     3.5
Leisure and sports                         4.0
Sleeping                                   8.8

(b)

(c) The Pareto diagram is better than the pie chart or the bar chart because it not only sorts the frequencies in descending order, it also provides the cumulative polygon on the same scale. From the Pareto diagram it is obvious that more than 50% of their day is spent sleeping, taking part in leisure and sports, and educational activities.

2.26

(a)

(b)

19% + 20% + 40% = 79%



2.26 cont.

(c)

(d)

2.27

The Pareto diagram is better than the pie chart because it not only sorts the frequencies in descending order, it also provides the cumulative polygon on the same scale.

(a)

(b)

The “vital few” reasons for the categories of complaints are “Credit Reporting, credit repair” and “Debt Collection,” which account for more than 78% of the complaints. The remaining categories are the “trivial many,” which make up less than 22% of the complaints.



2.27 cont.

(c)

(d)

The Pareto diagram is better than the pie chart and bar chart because it allows you to see which categories account for most of the complaints.



2.28

(a)



2.28 cont.

(a)

(b) (c)

2.29

Because energy use is spread over many types of appliances, a bar chart may be best in showing which types of appliances used the most energy. Air conditioning, space heating, water heating, and lighting accounted for over one-half (55.7%) of the residential energy use in the United States.

(a)



2.29 cont.

(a)

(b) 2.30

Two-thirds of the market share percentage (66%) is for Starbucks and Dunkin.

(a)

(b)

Part-time students are more likely to order from a local restaurant than full-time students.



2.31

(a)

(b) 2.32

Both Ashland and Springville have about the same amount of churning.

(a)

(b)

More paperless billing customers churned than customers who do not have paperless billing.

2.33

Stem-and-leaf of Finance Scores
5 | 34
6 | 9
7 | 4
9 | 38

2.34

Ordered array: 50 74 74 76 81 89 92


2.35

(a) Ordered array: 9.1 9.4 9.7 10.0 10.2 10.2 10.3 10.8 11.1 11.2 11.5 11.5 11.6 11.6 11.7 11.7 11.7 12.2 12.2 12.3 12.4 12.8 12.9 13.0 13.2

(b)

(c) The stem-and-leaf display conveys more information than the ordered array. We can more readily determine the arrangement of the data from the stem-and-leaf display than we can from the ordered array. We can also obtain a sense of the distribution of the data from the stem-and-leaf display.

(d) The most likely gasoline purchase is between 11 and 11.7 gallons. Yes, the third row is the most frequently occurring stem in the display and it is located in the center of the distribution.

2.36

(a) Stem unit: 1 ($billions)

1 | 01122333444446788
2 | 00012357
3 | 589
4 | 1
5 |
6 | 0

(b) The values are concentrated between $1 billion and $2 billion with one $6 billion (the New York Yankees).

(c) Stem unit: 10 ($millions)

 4 | 8
 5 | 8
 6 | 02
 7 |
 8 | 04
 9 | 35
10 | 5
11 | 3
12 |
13 | 02357
14 | 6
15 | 188
16 |
17 | 14
18 | 25
19 | 0
20 |
21 | 22
22 | 4
23 | 7
24 |
25 |
26 | 6
27 |
28 | 5

(d) The payrolls are spread out between $48 million and $285 million with some concentration between $130 million and $150 million.

2.37

(a) Download Speed: 6.5 29.4 31.1 32.5 32.8 36.3 37.1 53.3
Upload Speed: 3.7 4.0 5.8 12.9 13.0 15.6 16.9 17.5



2.37 cont.

(b)

Download Speeds: Stem unit:10 Leaf rounded to nearest integer

Upload Speeds: Stem unit 1

(c)

(d)

2.38

The stem-and-leaf display conveys more information than the ordered array. We can more readily determine the arrangement of the data from the stem-and-leaf display than we can from the ordered array. We can also obtain a sense of the distribution of the data from the stem-and-leaf display. Download speeds are concentrated around 30 Mbps and Upload speeds are varied with a group around 3 to 5 Mbps and a group around 13 to 17 Mbps.

(a)



2.38 cont.

(a)

(b)

(c)

The majority of electricity charges are clustered between $90 and $130.

2.39

The cost of attending a baseball game is concentrated around $65 with twelve teams at that cost. Five teams have costs of $85 and one team has the highest cost of $115.

2.40

Property taxes on a $176K home seem concentrated between $700 and $2,200 and also between $3,200 and $3,700.



2.41

(a)

(b)

(c)

The majority (79%) of commuters living in cities spend from 28 but less than 28 minutes commuting each week.



2.42

(a)

(b)

(c)

The average credit scores are concentrated between 710 and 740.



2.43

(a)

(b)

2.44

Yes, the steel mill is doing a good job at meeting the requirement as there is only one steel part out of a sample of 100 that is as much as 0.005 inches longer than the specified requirement.

(a)



2.44 cont.

(a)

(b)

(c)

The target is being met since 82% of the calls are being answered in less than 20 seconds.



2.45

(a)



2.45 cont.

(b)

(c)

2.46

The call center’s target of call duration less than 240 seconds is only met for 60% of the calls in this data set.

(a)



2.46 cont.

(a)

(b)



2.46 cont.

(b)

(c) 2.47

Manufacturer B produces bulbs with longer lives than Manufacturer A.

(a)



2.47 cont.

(b)
Amount of Soft Drink   Frequency Less Than   Percentage Less Than
1.899                                    1                     2%
1.949                                    6                    12%
1.999                                   24                    48%
2.049                                   43                    86%
2.099                                   49                    98%
2.149                                   50                   100%

(c) The amount of soft drink filled in the two-liter bottles is most concentrated in two intervals on either side of the two-liter mark, from 1.950 to 1.999 and from 2.000 to 2.049 liters. Almost three-fourths of the 50 bottles sampled contained between 1.950 liters and 2.049 liters.

2.48

(a) [Scatter plot of Y versus X]

(b) There is no relationship between X and Y.



2.49

(a)

(b)

2.50

Annual sales appear to be increasing in the earlier years before 2014, remain flat from 2015 to 2017 and then start to decline after 2018.

(a)



2.50 cont.

(b)

(c)

There appears to be a linear relationship between the first weekend gross and both the U.S. gross and the worldwide gross of Wizarding World movies. However, this relationship is greatly affected by the results of the Deathly Hallows, Part II movie.



2.51

(a)

(b)



2.51 cont.

2.52

(c)

(d)

There appears to be a positive relationship between Cable and Internet. There does not seem to be a relationship between Electricity and cable, nor between Electricity and Internet.

(a)

There appears to be little relationship between the download speed and the upload speed, although the carrier with the highest download speed also has the highest upload speed, as one might guess.

(b)

(c)

Yes, this is borne out by the data.



2.53

(a)

(b)



2.53 cont.

2.54

(c)

(d)

There does appear to be some relationship between Value and Payroll, also between Wins and Payroll.

(a)

Excel output:

(b)

There is a great deal of variation in the returns from decade to decade. Most of the returns are between 5% and 15%. The 1950s, 1980s, and 1990s had exceptionally high returns, and only the 1930s had negative returns.



2.55

(a)

(b)

2.56

There is an upward trend in home sales prices until 2006. Prices decline or remain flat from 2006 – 2011. From 2011 – 2016 there is an upward trend in median price of new home sales. Prices decline or remain flat from 2017 – 2019. From 2020 to 2022, there is an upward trend in home sales prices.

(a)



2.56 cont.

(b)

2.57

(a)

2.58

There was a decline in movie attendance from 2001 to 2021. During that time movie attendance first increased in 2002 before starting a long, slow decline through 2019. In 2020, movie attendance suffered a sharp drop, but then significantly increased in 2021.

(b)

From 2004 to 2008 the cost of a 30-second ad was constant at 2.7 million dollars. Since 2010, the cost has increased to its highest level of 5.6 million dollars in 2020 and then decreased to 5.5 million dollars in 2021.

(a)

Pivot Table in terms of %

Count of Type   Column Labels
Row Labels         One      Two    Three     Four    Five   Grand Total
Growth           3.11%   13.94%   22.60%   11.77%   4.47%        55.89%
  Small          0.41%    2.71%    6.22%    1.62%   1.49%        12.45%
  Mid-Cap        0.95%    3.52%    4.74%    3.92%   0.68%        13.80%
  Large          1.76%    7.71%   11.64%    6.22%   2.30%        29.63%
Value            2.57%   11.37%   16.78%    9.07%   4.33%        44.11%
  Small          0.54%    1.62%    3.11%    2.03%   0.68%         7.98%
  Mid-Cap        0.68%    2.30%    2.84%    1.62%   0.54%         7.98%
  Large          1.35%    7.44%   10.83%    5.41%   3.11%        28.15%
Grand Total      5.68%   25.30%   39.38%   20.84%   8.80%       100.00%

(b)

The growth and value funds have similar patterns in terms of star rating and type. Both growth and value funds have more funds with a rating of three. Very few funds have ratings of five. The growth and value funds have similar patterns in terms of market cap with many more large cap funds than mid-cap or small funds.



2.58 cont.

(c)

Pivot Table in terms of Average Three-Year Return

Average of 3Yr Return   Column Labels
Row Labels         One     Two   Three    Four    Five   Grand Total
Growth            8.94   14.29   16.93   18.32   24.90         16.76
  Small           7.86   10.79   14.35   15.10   22.75         14.47
  Mid-Cap         9.41   13.03   15.69   16.69   20.92         15.12
  Large           8.94   16.10   18.82   20.18   27.45         18.48
Value             8.71   11.24   12.27   13.94   16.63         12.57
  Small           6.06   10.37   11.62   12.54   14.16         11.44
  Mid-Cap         9.15   11.33   12.04   14.23   16.10         12.31
  Large           9.55   11.40   12.52   14.38   17.26         12.97
Grand Total       8.84   12.92   14.95   16.41   20.83         14.91

(d) The average 3-year return is directly related to star rating for both growth and value funds, with higher-rated funds having a higher 3-year average rate of return. For value funds, the average three-year rate of return is approximately the same for all market cap levels.

2.59

(a)

Pivot table of tallies in terms of counts:

Count of Star Rating   Column Labels
Row Labels       One    Two   Three   Four   Five   Grand Total
Small              7     32      69     27     16           151
  Low              1      5      14      6      3            29
  Average          2      9      20      6      6            43
  High             4     18      35     15      7            79
Mid-Cap           12     43      56     41      9           161
  Low              3     12      20      9      1            45
  Average          3     17      26     22      3            71
  High             6     14      10     10      5            45
Large             23    112     166     86     40           427
  Low              7     29      58     28     16           138
  Average         11     43      80     39     19           192
  High             5     40      28     19      5            97
Grand Total       42    187     291    154     65           739



2.59 cont.

(a)

Pivot table of tallies in terms of % of grand total:

Count of Star Rating   Column Labels
Row Labels         One      Two    Three     Four    Five   Grand Total
Small            0.95%    4.33%    9.34%    3.65%   2.17%        20.43%
  Low            0.14%    0.68%    1.89%    0.81%   0.41%         3.92%
  Average        0.27%    1.22%    2.71%    0.81%   0.81%         5.82%
  High           0.54%    2.44%    4.74%    2.03%   0.95%        10.69%
Mid-Cap          1.62%    5.82%    7.58%    5.55%   1.22%        21.79%
  Low            0.41%    1.62%    2.71%    1.22%   0.14%         6.09%
  Average        0.41%    2.30%    3.52%    2.98%   0.41%         9.61%
  High           0.81%    1.89%    1.35%    1.35%   0.68%         6.09%
Large            3.11%   15.16%   22.46%   11.64%   5.41%        57.78%
  Low            0.95%    3.92%    7.85%    3.79%   2.17%        18.67%
  Average        1.49%    5.82%   10.83%    5.28%   2.57%        25.98%
  High           0.68%    5.41%    3.79%    2.57%   0.68%        13.13%
Grand Total      5.68%   25.30%   39.38%   20.84%   8.80%       100.00%

(b)

For the large-cap funds, the three-star rating category had the highest percentage of funds, followed by two-star, four-star, five-star, and one-star. Very few large-cap funds had ratings of five. This pattern was also seen with the mid-cap funds as a group. The same pattern was observed with the small-cap funds. However, the pattern was more subtle in that the differences in percentage were less in many cases. Within the large-cap fund category, the highest percentage of funds were in the average-risk category followed by the low-risk and high-risk categories. Within the mid-cap category, the highest percentage of funds were in the average-risk category followed by the high-risk and low-risk categories. Within the small-cap category, the highest percentage of funds were in the high-risk category followed by the average-risk and low-risk categories.



2.59 cont.

(c)
Average of 3Yr Return   Column Labels
Row Labels         One     Two   Three    Four    Five   Grand Total
Small             6.83   10.63   13.44   13.68   20.07         13.28
  Low             4.00    8.85   12.82   13.56   20.90         12.82
  Average         8.49   10.73   13.90   14.97   20.39         14.04
  High            6.71   11.08   13.43   13.21   19.44         13.04
Mid-Cap           9.30   12.35   14.32   15.97   18.78         14.09
  Low            11.46   10.43   14.39   16.43   39.17         14.10
  Average         8.49   12.89   14.67   15.86   16.35         14.42
  High            8.62   13.35   13.29   15.81   16.15         13.56
Large             9.21   13.80   15.78   17.48   21.59         15.79
  Low            10.97   14.61   15.80   17.38   23.22         16.49
  Average         8.77   12.98   15.69   18.10   20.98         15.70
  High            7.69   14.08   16.00   16.39   18.73         15.00
Grand Total       8.84   12.92   14.95   16.41   20.83         14.91

(d) There are 28 high-risk large-cap funds with a three-star rating. Their average three-year return is 16.00.

2.60

(a)
Count of Type   Column Labels
Row Labels         One      Two    Three     Four    Five   Grand Total
Growth           3.11%   13.94%   22.60%   11.77%   4.47%        55.89%
  Low            1.08%    3.38%    6.63%    3.38%   1.62%        16.10%
  Average        1.22%    4.74%   10.01%    5.82%   1.62%        23.41%
  High           0.81%    5.82%    5.95%    2.57%   1.22%        16.37%
Value            2.57%   11.37%   16.78%    9.07%   4.33%        44.11%
  Low            0.41%    2.84%    5.82%    2.44%   1.08%        12.58%
  Average        0.95%    4.60%    7.04%    3.25%   2.17%        18.00%
  High           1.22%    3.92%    3.92%    3.38%   1.08%        13.53%
Grand Total      5.68%   25.30%   39.38%   20.84%   8.80%       100.00%

(b)

Both the growth and value funds are most likely to have a star rating of three followed by a rating of two and then four. Both the growth and value funds are most likely to have average risk followed equally by low and high risk.



2.60 cont.

(c)
Average of 3Yr Return   Column Labels
Row Labels         One     Two   Three    Four    Five   Grand Total
Growth            8.94   14.29   16.93   18.32   24.90         16.76
  Low            11.76   14.51   17.34   18.38   27.95         17.66
  Average         8.21   13.81   17.19   18.55   24.95         16.91
  High            6.28   14.56   16.04   17.71   20.74         15.64
Value             8.71   11.24   12.27   13.94   16.63         12.57
  Low             7.03   10.98   12.42   14.24   17.25         12.69
  Average         9.30   11.48   12.36   14.45   16.91         12.90
  High            8.81   11.15   11.90   13.24   15.47         12.03
Grand Total       8.84   12.92   14.95   16.41   20.83         14.91

(d)

The growth and value funds have similar patterns in terms of star rating and risk. Both growth and value funds have more funds with a rating of three. Very few funds have ratings of five. The growth and value funds have similar patterns in terms of market cap with many more large cap funds than mid-cap or small funds.



2.61

(a)

Pivot table of tallies in terms of % of grand total:

Count of Star Rating   Column Labels
Row Labels         One      Two    Three     Four    Five   Grand Total
Growth           3.11%   13.94%   22.60%   11.77%   4.47%        55.89%
  Small          0.41%    2.71%    6.22%    1.62%   1.49%        12.45%
    Low          0.00%    0.41%    0.81%    0.41%   0.27%         1.89%
    Average      0.14%    0.68%    2.03%    0.27%   0.54%         3.65%
    High         0.27%    1.62%    3.38%    0.95%   0.68%         6.90%
  Mid-Cap        0.95%    3.52%    4.74%    3.92%   0.68%        13.80%
    Low          0.41%    0.81%    1.62%    0.95%   0.14%         3.92%
    Average      0.27%    1.49%    2.57%    2.30%   0.14%         6.77%
    High         0.27%    1.22%    0.54%    0.68%   0.41%         3.11%
  Large          1.76%    7.71%   11.64%    6.22%   2.30%        29.63%
    Low          0.68%    2.17%    4.19%    2.03%   1.22%        10.28%
    Average      0.81%    2.57%    5.41%    3.25%   0.95%        12.99%
    High         0.27%    2.98%    2.03%    0.95%   0.14%         6.36%
Value            2.57%   11.37%   16.78%    9.07%   4.33%        44.11%
  Small          0.54%    1.62%    3.11%    2.03%   0.68%         7.98%
    Low          0.14%    0.27%    1.08%    0.41%   0.14%         2.03%
    Average      0.14%    0.54%    0.68%    0.54%   0.27%         2.17%
    High         0.27%    0.81%    1.35%    1.08%   0.27%         3.79%
  Mid-Cap        0.68%    2.30%    2.84%    1.62%   0.54%         7.98%
    Low          0.00%    0.81%    1.08%    0.27%   0.00%         2.17%
    Average      0.14%    0.81%    0.95%    0.68%   0.27%         2.84%
    High         0.54%    0.68%    0.81%    0.68%   0.27%         2.98%
  Large          1.35%    7.44%   10.83%    5.41%   3.11%        28.15%
    Low          0.27%    1.76%    3.65%    1.76%   0.95%         8.39%
    Average      0.68%    3.25%    5.41%    2.03%   1.62%        12.99%
    High         0.41%    2.44%    1.76%    1.62%   0.54%         6.77%
Grand Total      5.68%   25.30%   39.38%   20.84%   8.80%       100.00%

(b)

For growth funds, most are rated as three-star followed by two-star, four-star, five-star, and one-star. Among the growth funds, large-cap and small-cap funds had the same pattern of star rating as observed for growth funds in general. Most mid-cap funds were rated as three-star followed by four-star, two-star, one-star, and five-star. The pattern of star rating is different among the various risk levels within the large-cap, mid-cap, and small-cap growth funds. For value funds, most are rated as three-star followed by two-star, four-star, five-star, and one-star. Among the value funds, the pattern is the same for small-cap and large-cap funds. Mid-cap value funds have a different pattern. The pattern of star rating is different among the various risk levels within the large-cap, mid-cap, and small-cap funds.

(c)

The tables in 2.58 through 2.60 are easier to interpret because they contain fewer fields. The table in 2.61 tallies star rating across three fields: market type, market cap, and risk level. Problems 2.58 through 2.60 tally star rating across two fields. Problem 2.60 reveals that most value funds are rated as low-risk followed by average-risk and high-risk. Problem 2.61 reveals that this is only the case among large-cap value funds. Most mid-cap value funds are rated as average-risk followed by low-risk and high-risk. Most small-cap value funds are rated as average-risk followed by high-risk and low-risk. Problem 2.61 also reveals that among small-cap funds rated as average-risk, most are rated as four-star, followed by three-star and two-star. Because Problem 2.61 includes four fields compared to three fields included in Problems 2.58 through 2.60, additional patterns can be observed.

2.61 cont.

(d)

2.62

(a)

Pivot Table in terms of %

Count of Type   Column Labels
Row Labels         One      Two    Three     Four    Five   Grand Total
Growth           2.08%    9.30%   15.09%    7.86%   2.98%        37.31%
  Small          0.27%    1.81%    4.16%    1.08%   0.99%         8.31%
  Mid-Cap        0.63%    2.35%    3.16%    2.62%   0.45%         9.21%
  Large          1.17%    5.15%    7.77%    4.16%   1.54%        19.78%
Value            1.72%    7.59%   11.20%    6.05%   2.89%        29.45%
  Small          0.36%    1.08%    2.08%    1.36%   0.45%         5.33%
  Mid-Cap        0.45%    1.54%    1.90%    1.08%   0.36%         5.33%
  Large          0.90%    4.97%    7.23%    3.61%   2.08%        18.79%
Blend            2.80%    8.31%   12.01%    7.77%   2.35%        33.24%
  Small          0.63%    2.98%    2.89%    2.26%   0.36%         9.12%
  Mid-Cap        0.54%    2.17%    2.17%    1.08%   0.63%         6.59%
  Large          1.63%    3.16%    6.96%    4.43%   1.36%        17.52%
Grand Total      6.59%   25.20%   38.30%   21.68%   8.22%       100.00%

(b)

The growth and value funds have similar patterns in terms of star rating and type. Each of the categories of funds have more funds with a star rating of three. Very few funds have ratings of five. Each of the categories of funds have similar patterns in terms of market cap with many more large cap funds than mid-cap or small funds.



2.62 cont.

(c)

Pivot Table in terms of Average Three-Year Return

Average of 3Yr Return   Column Labels
Row Labels         One     Two   Three    Four    Five   Grand Total
Growth            8.94   14.29   16.93   18.32   24.90         16.76
  Small           7.86   10.79   14.35   15.10   22.75         14.47
  Mid-Cap         9.41   13.03   15.69   16.69   20.92         15.12
  Large           8.94   16.10   18.82   20.18   27.45         18.48
Value             8.71   11.24   12.27   13.94   16.63         12.57
  Small           6.06   10.37   11.62   12.54   14.16         11.44
  Mid-Cap         9.15   11.33   12.04   14.23   16.10         12.31
  Large           9.55   11.40   12.52   14.38   17.26         12.97
Blend             8.37   12.11   14.69   15.99   18.07         14.05
  Small          -0.35    9.53   11.61   12.77   12.81         10.44
  Mid-Cap         9.84   11.53   12.53   13.72   16.24         12.53
  Large          11.27   14.94   16.63   18.19   20.33         16.51
Grand Total       8.64   12.66   14.86   16.26   20.04         14.63

(d) The average 3-year return is directly related to star rating for all categories of funds, with higher-rated funds having a higher 3-year average rate of return. For blend and growth funds, the average three-year rate of return is higher for large funds.

2.63

(a)



Row Labels     Average of Assets   Average of SD
Low                  6588.208679     21.21665094
Average              5318.661634     21.21418301
High                 3310.876063     22.17891403
Grand Total          5082.428024     21.50339648

(b)
Row Labels     Average of SD   Average of Assets
Growth           21.67723971         5346.211598
Value            21.28315951         4748.248221
Grand Total      21.50339648         5082.428024

(c) The results from (a) reveal that the average of SD increases as the risk level increases while average of assets decreases as risk level increases. The results from (b) reveal that the average of SD is higher for growth funds compared to value funds. The patterns suggest that value funds are likely to be associated with less risk because the average of SD was lower among value funds and low risk funds.

2.64

Funds 1092, 1107, 1101, 782, and 259 have the lowest five-year return.



2.65

(a)

Row Labels     Average of YTD Return   Average of 10Yr Return
Small                   -8.356092715              11.58589404
Mid-Cap                 -8.220496894              11.99987578
Large                   -5.908758782              12.99564403
Grand Total             -6.912462788              12.49064953

(b)
Row Labels     Average of YTD Return   Average of 5Yr Return
Small                   -8.356092715             11.94039735
Mid-Cap                 -8.220496894             12.47677019
Large                   -5.908758782             13.85990632
Grand Total             -6.912462788             13.16635995



2.65 cont.

(c)

(d)

For the 1-year versus 10-year return chart, the 10-year returns are much higher than the 1-year returns, with similar 10-year returns near 12 percent for all three market cap categories. For the 1-year versus 5-year chart, the 5-year returns are all higher than the 1-year returns. The 5-year returns are slightly higher than the 10-year returns. Because the average 5-year returns were higher than the 10-year returns for all market cap categories, one can conclude that the returns were lower in years 6 through 10. Without annual data, one cannot tell whether this was due to consistently lower returns across those years or the result of one or two years with lower returns.
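The Grand Total row of a pivot table of averages is not the simple mean of the group averages; it is the mean over all funds, which equals the group averages weighted by group size. The sketch below checks this for the YTD returns in Problem 2.65(a). It assumes the pivot table covers the same 739 funds counted by market cap in Problem 2.59 (151 small, 161 mid-cap, 427 large); the function name `weighted_mean` is illustrative.

```python
# Fund counts per market cap (Problem 2.59 count table; assumed to be the
# same funds summarized in Problem 2.65) and average YTD return per group.
counts = {"Small": 151, "Mid-Cap": 161, "Large": 427}
avg_ytd = {
    "Small": -8.356092715,
    "Mid-Cap": -8.220496894,
    "Large": -5.908758782,
}


def weighted_mean(averages, counts):
    """Combine group averages into an overall average, weighting by group size."""
    total = sum(counts.values())
    return sum(averages[group] * counts[group] for group in averages) / total


grand_total = weighted_mean(avg_ytd, counts)
simple_mean = sum(avg_ytd.values()) / len(avg_ytd)
print(round(grand_total, 4), round(simple_mean, 4))
```

Under that assumption the weighted result matches the table's Grand Total of about -6.91, while the unweighted mean of the three group averages is noticeably lower because large-cap funds dominate the sample.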

2.66

The five funds with the lowest five-year return have (1) Large cap, Growth, High risk, One-star rating, (2) Large cap, Growth, Average risk, One-star rating, (3) Large cap, Growth, Average risk, One-star rating, (4) Small cap, Value, High risk, One-star rating, and (5) Small cap, Value, Low risk, Two-star rating.

2.67

(a) [Sparklines of home prices by year, 2000 to 2022]

(b)

The sparklines reveal a general upward trend in home prices during 2000, 2003-2005, 2012-2017, and 2019-2021, and a general downward trend during 2007, 2008, and 2011.

(c)

In the time-series plot, one can see an upward trend in home sales price until 2006. Prices decline or remain flat from 2006 to 2011. From 2011 to 2016 there is an upward trend in the median price of new home sales.

2.68

(a) [Sparklines of natural gas prices by year, 2011 to 2022]

(b)

There has been a slight decline in the price of natural gas over time until 2020, when prices greatly increased. Generally, the price is highest in the middle of the year.

2.69

Student project answers will vary

2.70

Student project answers will vary

2.71

(a) (b) (c)

There is a title. None of the axes are labeled.



2.71 cont.

(c)

2.72

(a) (b) (c)

There is a title. The simplest possible visualization is not used.



2.72 cont.

(c)

2.73

(a) (b) (c)

2.74

Answers will vary depending on selection of source.

None. The use of a 3D graph.



2.75

2.76

(a)

"Flexibility Breeds Productivity Increases"

(b)

The bar chart and the pie chart should be preferred over the exploded pie chart, the cone chart, and the pyramid chart since the former set is simpler and easier to interpret.

(a)


(b)

The bar chart and the pie chart should be preferred over the exploded pie chart, the cone chart, and the doughnut chart since the former set is simpler and easier to interpret.



2.77

A histogram uses bars to represent each class while a polygon uses a single point. The histogram should be used for only one group, while several polygons can be plotted on a single graph.

2.78

A summary table allows one to determine the frequency or percentage of occurrences in each category.

2.79

A bar chart is useful for comparing categories. A pie chart is useful when examining the portion of the whole that is in each category. A Pareto diagram is useful in focusing on the categories that make up most of the frequencies or percentages.

2.80

The bar chart for categorical data is plotted with the categories on the vertical axis and the frequencies or percentages on the horizontal axis. In addition, there is a separation between categories. The histogram is plotted with the class grouping on the horizontal axis and the frequencies or percentages on the vertical axis. This allows one to more easily determine the distribution of the data. In addition, there are no gaps between classes in the histogram.

2.81

A time-series plot is a type of scatter diagram with time on the x-axis.

2.82

Because the categories are arranged according to frequency or importance, it allows the user to focus attention on the categories that have the greatest frequency or importance.

2.83

Percentage breakdowns according to the total percentage, the row percentage, and/or the column percentage allow the interpretation of data in a two-way contingency table from several different perspectives.

2.84

A contingency table contains information on two categorical variables whereas a multidimensional table can display information on more than two categorical variables.

2.85

The multidimensional PivotTable can reveal additional patterns that cannot be seen in the contingency table. One can also change the statistic displayed and compute descriptive statistics which can add insight into the data.

2.86

In a PivotTable in Excel, double-clicking a cell drills down and causes Excel to display the underlying data in a new worksheet to enable you to then observe the data for patterns. In Excel, a slicer is a panel of clickable buttons that appears superimposed over a worksheet to enable you to work with many variables at once in a way that avoids creating an overly complex multidimensional contingency table that would be hard to comprehend and interpret.

2.87

Sparklines are compact time-series visualizations of numerical variables. Sparklines can also be used to plot time-series data using smaller time units than a time-series plot to reveal patterns that the time-series plot may not.



2.88

(a)



2.88 cont.

(a)

(b)

(c)

The publisher has the largest portion (66.06%) of the cost. 24.93% is editorial production manufacturing costs. The publisher’s marketing accounts for the next largest share of the revenue, at 11.60%. Author and bookstore personnel each account for around 11 to 12% of the cost, whereas the publisher profit accounts for more than 22% of the cost. Yes, the publisher’s profit cost has almost twice the cost of the authors.



2.89

(a) [Pareto charts by movie genre: Number of Movies; Gross; Tickets Sold]

(b)

Based on the Pareto chart for the number of movies, Drama, Comedy, Action, and Thriller/Suspense are the "vital few" and capture about 72% of the market share. According to the Pareto chart for gross (in $millions) and the Pareto chart for number of tickets sold (in millions), Adventure, Action, Drama, and Thriller/Suspense are the "vital few" and capture about 82% of the market share.



2.90

(a) [Charts for the Cybersecurity 1 and Cybersecurity 2 tables]

(b)

A bar chart would be best for both tables because each table contains a pair of categories that have similar percentages.

(c)

Almost all small business owners have a concern about cybersecurity and most assign someone the primary responsibility for online security.

2.91

(a)

Type of Entrée    Percentage    Number Ordered
Beef              29.68%        187
Chicken           16.35%        103
Mixed             4.76%         30
Duck              3.97%         25
Fish              19.37%        122
Pasta             10.00%        63
Vegan             11.75%        74
Veal              4.13%         26
Total             100.00%       630

(b)

(c)

The Pareto diagram has the advantage of offering the cumulative percentage view of the categories and, hence, enables the viewer to separate the "vital few" from the "trivial many".

(d)

Beef and fish account for nearly 50% of all entrées ordered by weekend patrons of a continental restaurant. When chicken is included, nearly two-thirds of the entrées are accounted for.

2.92

(a)

               Dessert Ordered
Reservation    Yes     No      Total
Yes            66%     48%     52%
No             34%     52%     48%
Total          100%    100%    100%

Dessert Ordered Yes No Total

Reservation Yes No 29% 34% 71% 52% 100% 48%

Total 100% 100% 100%

               Dessert Ordered
Reservation    Yes     No      Total
Yes            15%     37%     52%
No             8%      40%     48%
Total          23%     77%     100%

               Dessert Ordered
Beef Entrée    Yes     No      Total
Yes            52%     23%     30%
No             48%     77%     70%
Total          100%    100%    100%

               Dessert Ordered
Beef Entrée    Yes     No      Total
Yes            40%     60%     100%
No             15%     85%     100%
Total          23%     77%     100%

               Dessert Ordered
Beef Entrée    Yes       No        Total
Yes            11.75%    19.52%    31.27%
No             10.79%    57.94%    68.73%
Total          22.54%    77.46%    100%

(b)

If the owner is interested in finding out the percentage of those with a reservation who order dessert or the percentage of those who order a beef entrée and a dessert among all patrons, the table of total percentages is most informative. If the owner is interested in the effect of reservation on ordering of dessert or the effect of ordering a beef entrée on the ordering of dessert, the table of column percentages will be most informative. Because dessert is usually ordered after the main entrée, and the owner has no direct control over the reservation planning of patrons, the table of row percentages is not very useful here.

(c)

29% of those with reservations ordered desserts, compared to 17% of those without. Almost 40% of the patrons ordering a beef entrée ordered dessert, compared to 16% of patrons ordering all other entrées. Patrons ordering beef are more than 2.5 times as likely to order dessert as patrons ordering any other entrée.
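The 29% and 17% figures in (c) can be recovered from the table of total percentages for reservation and dessert. The following is a minimal sketch of that arithmetic, assuming the total percentages are 15% (reservation, dessert), 37% (reservation, no dessert), 8% (no reservation, dessert), and 40% (no reservation, no dessert); the dictionary and function names are illustrative only.

```python
# Total percentages (share of all patrons), as read from the
# reservation-by-dessert table of total percentages.
total_pct = {
    ("reservation", "dessert"): 15,
    ("reservation", "no dessert"): 37,
    ("no reservation", "dessert"): 8,
    ("no reservation", "no dessert"): 40,
}


def dessert_rate(group):
    """Percentage of patrons in `group` who ordered dessert."""
    yes = total_pct[(group, "dessert")]
    no = total_pct[(group, "no dessert")]
    return 100 * yes / (yes + no)


with_reservation = dessert_rate("reservation")
without_reservation = dessert_rate("no reservation")
print(f"with reservation: {with_reservation:.0f}%")
print(f"without reservation: {without_reservation:.0f}%")
```

Dividing each cell by its group total, rather than by the grand total, is what turns a total percentage into the conditional percentage quoted in the answer.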



2.93

(a) [Charts of fresh food consumed: United States, Japan, Russia]

(b) [Charts of packaged food consumed: United States, Japan, Russia]

(c)

The fresh food consumption patterns of Japanese and Russians are quite similar, with vegetables taking up the largest share followed by meats and seafood, while Americans consume about the same amount of meats and seafood as vegetables. Among the three countries, vegetables, and meats and seafood constitute more than 60% of the fresh food consumption. For Americans, dairy products, and processed, frozen, dried and chilled food and ready-to-eat meals make up slightly more than 60% of the packaged food consumption. For Japanese, processed, frozen, dried and chilled food, and ready-to-eat meals, and dairy products constitute more than 60% of their packaged food consumption. For the Russians, bakery goods and dairy products take up 60% of the share of their packaged food consumption.

2.94

(a)

Most complaints were against U.S. airlines.

(b)

Most of the complaints were due to refunds and flight problems.

2.95

(a)

Range                     Frequency    Percentage
0 but less than 25        17           34%
25 but less than 50       19           38%
50 but less than 75       5            10%
75 but less than 100      2            4%
100 but less than 125     3            6%
125 but less than 150     2            4%
150 but less than 175     2            4%



(b) [Histogram; x-axis: Days (classes from 0 to under 175), y-axis: Frequency]

[Percentage polygon of the same distribution]

(c)

Range                     Cumulative %
0 but less than 25        34%
25 but less than 50       72%
50 but less than 75       82%
75 but less than 100      86%
100 but less than 125     92%
125 but less than 150     96%
150 but less than 175     100%

(c) [Cumulative percentage polygon]

(d)

You should tell the president of the company that over half of the complaints are resolved within a month, but point out that some complaints take as long as three or four months to settle.

2.96

(a)



(b)

(c)

The alcohol percentage is concentrated between 4% and 6%, with more between 4% and 5%. The calories are concentrated between 140 and 160. The carbohydrates are concentrated between 12 and 15. There are outliers in the percentage of alcohol in both tails. There are a few beers with alcohol content as high as around 11.5%. There are a few beers with calorie content as high as around 330 and carbohydrates as high as 32.1. There is a positive relationship between percentage of alcohol and calories and between calories and carbohydrates, and there is a moderately positive relationship between percentage alcohol and carbohydrates.

2.97

(a)

Ordered array of ratings of Super Bowl ads that ran before halftime:
4.5 4.7 5.0 5.0 5.2 5.2 5.3 5.3 5.3 5.3
5.3 5.4 5.5 5.8 5.8 5.9 6.0 6.1 6.2 6.2
6.3 6.4 6.4 6.5 6.7 6.7 6.7 7.4

Ordered array of ratings of Super Bowl ads that ran at or after halftime:
4.0 4.3 4.7 4.8 4.8 4.9 4.9 5.2 5.2 5.3
5.3 5.3 5.4 5.5 5.6 5.6 5.7 5.7 5.8 5.8
5.9 5.9 6.0 6.0 6.1 6.3 6.5 6.7 7.3

(b)

Stem-and-leaf display of ratings of Super Bowl ads that ran before halftime
Stem unit: 1

4 | 57
5 | 00223333345889
6 | 01223445777
7 | 4

Stem-and-leaf display of ratings of Super Bowl ads that ran at or after halftime
Stem unit: 1

4 | 0378899
5 | 223334566778899
6 | 001357
7 | 3
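As a cross-check on the ordered arrays and stem-and-leaf displays, the share of ads at or below a given rating can be counted directly. This is only a sketch: the two lists are the ordered arrays from (a), and the helper name `pct_at_or_below` is illustrative.

```python
# Ordered arrays of ad ratings from part (a).
before = [4.5, 4.7, 5.0, 5.0, 5.2, 5.2, 5.3, 5.3, 5.3, 5.3,
          5.3, 5.4, 5.5, 5.8, 5.8, 5.9, 6.0, 6.1, 6.2, 6.2,
          6.3, 6.4, 6.4, 6.5, 6.7, 6.7, 6.7, 7.4]
after = [4.0, 4.3, 4.7, 4.8, 4.8, 4.9, 4.9, 5.2, 5.2, 5.3,
         5.3, 5.3, 5.4, 5.5, 5.6, 5.6, 5.7, 5.7, 5.8, 5.8,
         5.9, 5.9, 6.0, 6.0, 6.1, 6.3, 6.5, 6.7, 7.3]


def pct_at_or_below(ratings, cutoff):
    """Percentage of ratings that are less than or equal to cutoff."""
    return 100 * sum(r <= cutoff for r in ratings) / len(ratings)


for label, ratings in (("before halftime", before), ("at or after halftime", after)):
    below_5 = sum(r < 5.0 for r in ratings)
    above_7 = sum(r > 7.0 for r in ratings)
    print(f"{label}: n={len(ratings)}, "
          f"rating <= 6: {pct_at_or_below(ratings, 6.0):.0f}%, "
          f"below 5: {below_5}, above 7: {above_7}")
```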

(c)

(d)

(e)

(f)

There was just one more ad run at or after halftime than before halftime. Approximately 61% of the ads run before halftime had a rating of 6 or less, while 83% of the ads run at or after halftime had a rating of 6 or less. Seven ads run at or after halftime had a rating of less than 5, while only two ads run before halftime had a rating of less than 5. One ad run before halftime had a rating of more than 7, and one ad run at or after halftime had a rating of more than 7.

2.98

(a)

Stem-and -leaf of One-Year CD Stem unit:

0.1

0

2333355555

1

0000000035555

2

00055557

3

5

4 5

5555



2.98 cont.

(a)

Stem-and -leaf of Five-Year CD Stem unit:

0.1

0

233335

1

555

2

000055

3

0000

4

0078

5

00005

6

000

7 8

005

9 10 (b)

00

The yields of one-year CDs are mostly less than 3.0. The yields of five-year CDs are mostly at least 3.0.

(c)



(d)

There appears to be a positive relationship between the yield of the one-year CD and the five-year CD.



2.99

(a)

Download Speed (Mbps)       Frequency    Percentage
600 but less than 700       13           13.00%
700 but less than 800       49           49.00%
800 but less than 900       28           28.00%
900 but less than 1000      9            9.00%
1000 but less than 1100     0            0%
1100 but less than 1200     0            0%
1200 but less than 1300     0            0%
1300 but less than 1400     0            0%
1400 but less than 1500     0            0%
1500 but less than 1600     0            0%
1600 but less than 1700     0            0%
1700 but less than 1800     0            0%
1800 but less than 1900     0            0%
1900 but less than 2000     1            1.00%

(b)



(c)

Download Speed (Mbps)       Frequency    Percentage    Cumulative Percentage
600 but less than 700       13           13.00%        13.00%
700 but less than 800       49           49.00%        62.00%
800 but less than 900       28           28.00%        90.00%
900 but less than 1000      9            9.00%         99.00%
1000 but less than 1100     0            0%            99.00%
1100 but less than 1200     0            0%            99.00%
1200 but less than 1300     0            0%            99.00%
1300 but less than 1400     0            0%            99.00%
1400 but less than 1500     0            0%            99.00%
1500 but less than 1600     0            0%            99.00%
1600 but less than 1700     0            0%            99.00%
1700 but less than 1800     0            0%            99.00%
1800 but less than 1900     0            0%            99.00%
1900 but less than 2000     1            1.00%         100.00%



2.99 cont.

(c)

(d)

Approximately 90% of the cities have download speeds from 600 to 900 Mbps.
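The 90% figure is the cumulative percentage through the 800 to 900 Mbps class. A minimal sketch of how the cumulative percentage column can be built from the class frequencies in (a) follows; the variable names are illustrative.

```python
from itertools import accumulate

# Lower class boundaries (Mbps) and frequencies from the table in (a).
lower_bounds = list(range(600, 2000, 100))
frequencies = [13, 49, 28, 9, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1]

total = sum(frequencies)
# Running total of frequencies, expressed as a percentage of all cities.
cumulative_pct = [100 * c / total for c in accumulate(frequencies)]

for lo, f, cum in zip(lower_bounds, frequencies, cumulative_pct):
    print(f"{lo} but less than {lo + 100}: {f:3d}  {cum:6.2f}%")
```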

(e)

(f)

There does not appear to be any relationship between download speed and number of providers.



2.100

(a) Frequencies (Boston)

Weight (Boston)             Frequency    Percentage
3015 but less than 3050     2            0.54%
3050 but less than 3085     44           11.96%
3085 but less than 3120     122          33.15%
3120 but less than 3155     131          35.60%
3155 but less than 3190     58           15.76%
3190 but less than 3225     7            1.90%
3225 but less than 3260     3            0.82%
3260 but less than 3295     1            0.27%

(b) Frequencies (Vermont)

Weight (Vermont)            Frequency    Percentage
3550 but less than 3600     4            1.21%
3600 but less than 3650     31           9.39%
3650 but less than 3700     115          34.85%
3700 but less than 3750     131          39.70%
3750 but less than 3800     36           10.91%
3800 but less than 3850     12           3.64%
3850 but less than 3900     1            0.30%



2.100 cont.

(c)

(d)

0.54% of the "Boston" shingles pallets are underweight while 0.27% are overweight. 1.21% of the "Vermont" shingles pallets are underweight while 3.94% are overweight.
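These shares can be read off the frequency distributions by summing the classes that fall outside the acceptable range. The sketch below assumes lower limits of 3,050 (Boston) and 3,600 (Vermont) pounds and upper limits of 3,260 and 3,800 pounds; the upper limits are not stated in this solution and are inferred from the percentages quoted, so treat them as assumptions. The function name is illustrative.

```python
def share_outside(lower_bounds, frequencies, min_weight, max_weight):
    """Return (underweight %, overweight %) from a frequency distribution.

    A class counts as underweight if its lower boundary is below min_weight,
    and as overweight if its lower boundary is at or above max_weight.
    """
    total = sum(frequencies)
    under = sum(f for lo, f in zip(lower_bounds, frequencies) if lo < min_weight)
    over = sum(f for lo, f in zip(lower_bounds, frequencies) if lo >= max_weight)
    return 100 * under / total, 100 * over / total


boston_lower = [3015, 3050, 3085, 3120, 3155, 3190, 3225, 3260]
boston_freq = [2, 44, 122, 131, 58, 7, 3, 1]
vermont_lower = [3550, 3600, 3650, 3700, 3750, 3800, 3850]
vermont_freq = [4, 31, 115, 131, 36, 12, 1]

# Weight limits below are assumptions (see the note above).
boston_under, boston_over = share_outside(boston_lower, boston_freq, 3050, 3260)
vermont_under, vermont_over = share_outside(vermont_lower, vermont_freq, 3600, 3800)

print(f"Boston:  {boston_under:.2f}% under, {boston_over:.2f}% over")
print(f"Vermont: {vermont_under:.2f}% under, {vermont_over:.2f}% over")
```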



2.101

(a)

Not member of major conference

Total Pay ($000)            Frequency    Percentage
400 but less than 500       4            8.00%
500 but less than 1000      27           54.00%
1000 but less than 1500     6            12.00%
1500 but less than 2000     7            14.00%
2000 but less than 2500     4            8.00%
2500 but less than 3000     1            2.00%
3000 but less than 3500     1            2.00%
Total                       50           100.00%

Member of major conference

Total Pay ($000)             Frequency    Percentage
700 but less than 1000       5            7.69%
1000 but less than 2000      4            6.15%
2000 but less than 3000      9            13.85%
3000 but less than 4000      13           20.00%
4000 but less than 5000      16           24.62%
5000 but less than 6000      11           16.92%
6000 but less than 7000      2            3.08%
7000 but less than 8000      2            3.08%
8000 but less than 9000      2            3.08%
9000 but less than 10,000    1            1.54%
Total                        65           100.00%

(b)



(c)

Not member of major conference

Total Pay ($000)            Frequency    Percentage    Cumulative Percentage
400 but less than 500       4            8.00%         8.00%
500 but less than 1000      27           54.00%        62.00%
1000 but less than 1500     6            12.00%        74.00%
1500 but less than 2000     7            14.00%        88.00%
2000 but less than 2500     4            8.00%         96.00%
2500 but less than 3000     1            2.00%         98.00%
3000 but less than 3500     1            2.00%         100.00%
Total                       50           100.00%

Member of major conference

Total Pay ($000)             Frequency    Percentage    Cumulative Percentage
700 but less than 1000       5            7.69%         7.69%
1000 but less than 2000      4            6.15%         13.85%
2000 but less than 3000      9            13.85%        27.69%
3000 but less than 4000      13           20.00%        47.69%
4000 but less than 5000      16           24.62%        72.31%
5000 but less than 6000      11           16.92%        89.23%
6000 but less than 7000      2            3.08%         92.31%
7000 but less than 8000      2            3.08%         95.38%
8000 but less than 9000      2            3.08%         98.46%
9000 but less than 10,000    1            1.54%         100.00%
Total                        65           100.00%

(c)

(d)

Coaches Pay – Yes Major Conference Member
Stem unit: 1000

0 | 78889
1 | 3369
2 | 4788999
3 | 000001224455889
4 | 0000002233444488
5 | 00002355667
6 | 15
7 | 67
8 | 49
9 | 0


(d)

Coaches Pay – Not a Major Conference Member
Stem unit: 100

 4 | 348
 5 | 01444
 6 | 388
 7 | 001355578889
 8 | 0003347
 9 | 1
10 | 059
11 |
12 |
13 | 058
14 |
15 | 00459
16 |
17 |
18 | 05
19 |
20 | 0
21 | 0
22 |
23 | 12
24 |
25 |
26 |
27 |
28 | 8
29 |
30 |
31 |
32 |
33 |
34 | 0

(e)

The majority of coaches in the non-major conferences earn under 1 million dollars, while the majority of coaches in a major conference earn between 3 and 6 million dollars per year. The highest paid coach in a major conference earns $9,013,000, while the highest paid coach in a non-major conference earns $3,400,000.



2.102

(a)

Calories         Frequency    Percentage    Percentage Less Than
50 up to 100     3            12%           12%
100 up to 150    3            12%           24%
150 up to 200    9            36%           60%
200 up to 250    6            24%           84%
250 up to 300    3            12%           96%
300 up to 350    0            0%            96%
350 up to 400    1            4%            100%

Cholesterol      Frequency    Percentage    Percentage Less Than
0 up to 50       2            8%            8%
50 up to 100     17           68%           76%
100 up to 150    4            16%           92%
150 up to 200    1            4%            96%
200 up to 250    0            0%            96%
250 up to 300    0            0%            96%
300 up to 350    0            0%            96%
350 up to 400    0            0%            96%
400 up to 450    0            0%            96%
450 up to 500    1            4%            100%

(b)



2.102 cont.

(b)

(c)

2.103

The sampled fresh red meats, poultry, and fish vary from 98 to 397 calories per serving, with the highest concentration between 150 and 200 calories. One protein source, spareribs, with 397 calories, is more than 100 calories above the next highest caloric food. The protein content of the sampled foods varies from 16 to 33 grams, with 68% of the data values falling between 24 and 32 grams. Spareribs and fried liver are both very different from the other foods sampled: the former on calories and the latter on cholesterol content.

(a)



2.103 cont.

(b)

The commercial average price was highest in the fall of 2021. The residential average price of natural gas in the United States is higher in the summer in general and the highest in summer of 2021.

(c)

(d)

2.104

There appears to be a slight positive relationship between the commercial price and residential price.

(a)

[Time-series plot of the amount of soft drink filled; y-axis: Amount (1.85 to 2.15)]

(b)

There is a downward trend in the amount filled.

(c) (d)

2.105

(a)

The amount filled in the next bottle will most likely be below 1.894 liters. The scatter plot of the amount of soft drink filled against time reveals the trend of the data, whereas a histogram only provides information on the distribution of the data.



2.105 cont.

(b)

(c)

The Japanese yen had depreciated against the U.S. dollar since 1985 while the Canadian dollar appreciated gradually from 1980 to 1987 and from 1991 to 2002 and then started to depreciate until 2011. The English pound to U.S. dollar exchange rate has been quite stable since 1983. The U.S. dollar has appreciated against the Japanese yen since 1980 and appreciated against the Canadian dollar since 2002 in general, while the exchange rate against the English pound has been stable in general.

(d)

(e)

2.106

There is not any obvious relationship between the Canadian dollar and Japanese yen in terms of the U.S. dollar nor any relationship between the Japanese yen and English pound. There is a slightly positive relationship between the Canadian dollar and English pound.

(a)

Variations                        Percentage of Download
Original Call to Action Button    9.64%
New Call to Action Button         13.64%



(b) [Bar chart of percentage of downloads by variation: Original Call to Action Button, New Call to Action Button]

(c)

The New Call to Action Button has a higher percentage of downloads, at 13.64%, than the Original Call to Action Button, at 9.64%.

(d)

Variations             Percentage of Download
Original web design    8.90%
New web design         9.41%

(e) [Bar chart of percentage of downloads by variation: Original web design, New web design]

(f)

The New web design has only a slightly higher percentage of downloads, at 9.41%, than the Original web design, at 8.90%.

(g)

The New web design is only slightly more successful than the Original web design, while the New Call to Action Button is much more successful than the Original Call to Action Button, with about a 41% higher percentage of downloads.
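The "about 41% higher" figure is a relative change in the download percentage, not a difference in percentage points. A minimal sketch of that calculation, using the percentages from (a) and (d), is shown here; the function name `relative_lift` is illustrative.

```python
def relative_lift(old_pct, new_pct):
    """Relative change from old_pct to new_pct, as a percentage of old_pct."""
    return 100 * (new_pct - old_pct) / old_pct


button_lift = relative_lift(9.64, 13.64)  # Call to Action Button, from (a)
design_lift = relative_lift(8.90, 9.41)   # web design, from (d)

print(f"Call to Action Button: {button_lift:.1f}% higher")
print(f"Web design: {design_lift:.1f}% higher")
```

The button change is a gain of 4 percentage points on a base of 9.64%, which is why the relative lift is so much larger than the raw difference suggests.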

(h)

Call to Action Button    Web Design    Percentage of Download
Old                      Old           8.30%
New                      Old           13.70%
Old                      New           9.50%
New                      New           17.00%

(i)

(j)

The combination of the New Call to Action Button and the New web design results in slightly more than twice as high a percentage of downloads than the combination of the Old Call to Action Button and Old web design. The New web design is only slightly more successful than the Original web design while the New Call to Action Button is much more successful than the Original Call to Action Button with about 41% higher percentage of downloads. However, the combination of the New Call to Action Button and New web design results in more than twice as high a percentage of downloads than the combination of the Old Call to Action Button and Old web design.

2.107

Class project – answers will vary depending on student responses.

2.108

Class project – answers will vary depending on student responses.



2.109

A descriptive analysis of the weight of the pallets of the Boston shingles revealed that the average weight was 3124.2 pounds with a standard deviation of 34.7. The average weight of 3124.2 pounds was 74.2 pounds above the expected minimum weight of 3,050 pounds. An analysis of the Vermont shingles revealed that the average weight was 3704.0 pounds with a standard deviation of 46.7. The average weight of 3704.0 pounds was 104 pounds above the expected minimum weight of 3,600 pounds. The below table includes a number of descriptive statistics for the two shingle types.

A frequency distribution of the Boston shingles revealed that 0.54% of the pallets were underweight and 0.27% were overweight. A frequency distribution of the Vermont shingles revealed that 1.21% of the shingles were underweight and 3.94% were overweight. The complete results are provided in the below frequency distributions.

Histogram graphs of the Boston shingles and the Vermont shingles, shown below, revealed that the weights of the pallets appeared to be consistent with a normal distribution. In both cases, there was slight right skewness with the Boston shingles having slightly more right skewness than the Vermont shingles.



2.109 cont.

The results of the above analyses reveal that both shingle types generally met pallet weight expectations with less than 1% of the Boston shingles weighing outside of the expected parameters and just over 5% of the Vermont shingles weighing outside of the expected parameters. The results suggest that the manufacturer should consider implementation of parameter compliance strategies for the Vermont shingles.

Chapter 3


3.1

Excel output:

X
Mean                        6
Median                      7
Mode                        #N/A
Standard Deviation          2.915476
Sample Variance             8.5
Range                       7
Minimum                     2
Maximum                     9
Sum                         30
Count                       5
First Quartile              3
Third Quartile              8.5
Interquartile Range         5.5
Coefficient of Variation    48.5913%

(a)

Mean = 6 Median = 7 There is no mode.

(b)

Range = 7 Variance = 8.5 Standard deviation = 2.9 Coefficient of variation = (2.915/6) • 100% = 48.6%

(c)

Z scores: 0.343, –0.686, 1.029, 0.686, –1.372. None of the Z scores is larger than 3.0 or smaller than –3.0. There is no outlier.

(d)

Since the mean is less than the median, the distribution is left-skewed.

3.2

Excel output:

X
Mean                        7
Median                      7
Mode                        7
Standard Deviation          3.286335
Sample Variance             10.8
Range                       9
Minimum                     3
Maximum                     12
Sum                         42
Count                       6
First Quartile              4
Third Quartile              9
Interquartile Range         5
Coefficient of Variation    46.9476%

(a)

Mean = 7 Median = 7 Mode = 7

(b)

Range = 9 Variance = 10.8 Standard deviation = 3.286 Coefficient of variation = (3.286/7) • 100% = 46.948%
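The measures for 3.2 can be reproduced with Python's standard `statistics` module. This is a sketch only: the data values 7, 4, 9, 7, 3, 12 are not listed in this solution and are reconstructed here from the reported mean, standard deviation, and Z scores, so treat them as an assumption.

```python
import statistics

# Assumed data for 3.2, reconstructed from the reported summary measures.
data = [7, 4, 9, 7, 3, 12]

mean = statistics.mean(data)
median = statistics.median(data)
mode = statistics.mode(data)
variance = statistics.variance(data)  # sample variance (n - 1 in the denominator)
std_dev = statistics.stdev(data)      # sample standard deviation
data_range = max(data) - min(data)
cv = 100 * std_dev / mean             # coefficient of variation, in percent
z_scores = [(x - mean) / std_dev for x in data]

print(f"mean={mean}, median={median}, mode={mode}")
print(f"range={data_range}, variance={variance:.1f}, std dev={std_dev:.3f}, CV={cv:.3f}%")
print("Z scores:", [round(z, 3) for z in z_scores])
print("outliers:", [x for x, z in zip(data, z_scores) if abs(z) > 3.0])
```

A value is flagged as an outlier here only when its Z score is beyond ±3.0, the same rule the solutions apply.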



3.2 cont.

(c)

Z scores: 0, –0.913, 0.609, 0, –1.217, 1.522. None of the Z scores is larger than 3.0 or smaller than –3.0. There is no outlier.

(d)

Since the mean equals the median, the distribution is symmetrical.

3.3

Excel output:

X
Mean                        5.85714
Median                      7
Mode                        7
Minimum                     0
Maximum                     11
Range                       11
Variance                    14.1429
Standard Deviation          3.7607
Coefficient of Variation    64.21%
Skewness                    –0.2659
Kurtosis                    –0.6032
Count                       7
Standard Error              1.4214
First Quartile              3
Third Quartile              9

(a)

Mean = 5.85714 Median = 7 Mode = 7

(b)

Range = 11 Variance = 14.1429 Standard deviation = 3.7607 Coefficient of variation = 64.21%

(c)

Z scores: 1.37, 0.30, –0.49, 0.84, –1.56, 0.30, –0.76. There is no outlier.

(d)

Since the mean is less than the median, the distribution is left-skewed.

3.4

Excel output:

X
Mean                        2
Median                      7
Mode                        7
Standard Deviation          7.874007874
Sample Variance             62
Range                       17
Minimum                     –8
Maximum                     9
Sum                         10
Count                       5
First Quartile              –6.5
Third Quartile              8
Interquartile Range         14.5
Coefficient of Variation    393.7004%

(a)

Mean = 2 Median = 7 Mode = 7

(b)

Range = 17 Variance = 62 Standard deviation = 7.874 Coefficient of variation = (7.874/2) • 100% = 393.7%

(c)

Z scores: 0.635, –0.889, –1.270, 0.635, 0.889. No outliers.

(d)

Since the mean is less than the median, the distribution is left-skewed.

3.5

$\bar{R}_G = [(1 + 0.1)(1 + 0.3)]^{1/2} - 1 = 19.58\%$

3.6

$\bar{R}_G = [(1 + 0.2)(1 + (-0.3))]^{1/2} - 1 = -8.348\%$

3.7

Half of high school graduates with no college have a weekly income of no more than $838 while half of the workers with at least a bachelor’s degree have weekly income of no more than $1,547.

3.8

(a)

                      Grade X    Grade Y
Mean                  575        575.4
Median                575        575
Standard deviation    6.4        2.1

(b)

If quality is measured by central tendency, Grade X tires provide slightly better quality because X's mean and median are both equal to the expected value, 575 mm. If, however, quality is measured by consistency, Grade Y provides better quality because, even though Y's mean is only slightly larger than the mean for Grade X, Y's standard deviation is much smaller. The range in values for Grade Y is 5 mm compared to the range in values for Grade X, which is 16 mm.

(c)

Excel output:

                      Grade X     Grade Y, Altered
Mean                  575         577.4
Median                575         575
Mode                  #N/A        #N/A
Standard Deviation    6.403124    6.107373
Sample Variance       41          37.3
Range                 16          15
Minimum               568         573
Maximum               584         588
Sum                   2875        2887
Count                 5           5

                      Grade X    Grade Y, Altered
Mean                  575        577.4
Median                575        575
Standard deviation    6.4        6.1

When the fifth Y tire measures 588 mm rather than 578 mm, Y's mean inner diameter becomes 577.4 mm, which is larger than X's mean inner diameter, and Y's standard deviation increases from 2.1 mm to 6.1 mm. In this case, X's tires are providing better quality in terms of the mean inner diameter, with only slightly more variation among the tires than Y's.

3.9

(a)

Half of the new houses were sold at a price no higher than $428,700.

(b)

On average, the sales price of new houses was $507,800.

(c)

Because the mean is greater than the median, the distribution of the sales price of new houses in 2022 is right-skewed.



3.10

(a), (b)

                       Download Speed    Upload Speed
Mean                   820.5125          105.925
Median                 733.85            107.8
Mode                   #N/A              #N/A
Minimum                675.4             80.9
Maximum                1171.3            131.7
Range                  495.9             50.8
Variance               31266.4727        278.2250
Standard Deviation     176.8233          16.6801
Coeff. of Variation    21.55%            15.75%
Skewness               1.4728            –0.0029
Kurtosis               1.1581            –0.5582
Count                  8                 8
Standard Error         62.5165           5.8973

(c), (d)

The mean download speed is much higher than the median, indicating right skewness, whereas the mean and median upload speeds are about the same, indicating symmetry. The mean download speed is much higher than the mean upload speed: download speeds are about 8 times faster than upload speeds. The download speeds vary more between cities, with a coefficient of variation of 21.55%, than the upload speeds, with a coefficient of variation of 15.75%. The kurtosis of the download speeds indicates a greater departure from normality than that of the upload speeds.

3.11

(a), (b)

                       First Half     After Halftime
Mean                   5.789285714    5.534482759
Median                 5.8            5.6
Mode                   5.3            5.3
Minimum                4.5            4
Maximum                7.4            7.3
Range                  2.9            3.3
Variance               0.4928         0.5031
Standard Deviation     0.7020         0.7093
Coeff. of Variation    12.13%         12.82%
Skewness               0.2392         0.1578
Kurtosis               –0.5132        0.5515
Count                  28             29
Standard Error         0.1327         0.1317



3.11 cont.

(a), (b)

First Half    Z score     Halftime or After    Z score
4.5           –1.83652    4.0                  –2.16349
4.7           –1.55163    4.3                  –1.74051
5.0           –1.12429    4.7                  –1.17655
5.0           –1.12429    4.8                  –1.03556
5.2           –0.8394     4.8                  –1.03556
5.2           –0.8394     4.9                  –0.89457
5.3           –0.69696    4.9                  –0.89457
5.3           –0.69696    5.2                  –0.47159
5.3           –0.69696    5.2                  –0.47159
5.3           –0.69696    5.3                  –0.3306
5.3           –0.69696    5.3                  –0.3306
5.4           –0.55452    5.3                  –0.3306
5.5           –0.41207    5.4                  –0.18961
5.8           0.015262    5.5                  –0.04862
5.8           0.015262    5.6                  0.092374
5.9           0.157706    5.6                  0.092374
6.0           0.300151    5.7                  0.233365
6.1           0.442595    5.7                  0.233365
6.2           0.585039    5.8                  0.374356
6.2           0.585039    5.8                  0.374356
6.3           0.727484    5.9                  0.515348
6.4           0.869928    5.9                  0.515348
6.4           0.869928    6.0                  0.656339
6.5           1.012373    6.0                  0.656339
6.7           1.297261    6.1                  0.797331
6.7           1.297261    6.3                  1.079313
6.7           1.297261    6.5                  1.361296
7.4           2.294372    6.7                  1.643279
                          7.3                  2.489227

(c), (d)

The mean and median are approximately equal for the before Halftime ratings indicating the data is symmetric. The median is more than the mean for the ratings on ads running at halftime or after indicating a left or negatively skewed distribution. The mean and median are both greater for the ads running before halftime compared to the ads running at or after halftime. There is about the same amount of variation in the ratings of the ads running at or after halftime as those running in the first half.



3.12

(a), (b)

Statistic             Team Value   Payroll
Mean                  2.074        147.2133333
Median                1.73         141.555
Mode                  #N/A         #N/A
Minimum               0.99         48.06
Maximum               6            284.73
Range                 5.01         236.67
Variance              1.2997       3869.0060
Standard Deviation    1.1401       62.2013
Coeff. Of Variation   54.97%       42.25%
Skewness              1.8739       0.3460
Kurtosis              3.8080       -0.4260
Count                 30           30
Standard Error        0.2081       11.3564

(c)

The team value is right skewed with the mean about 0.3 billion dollars higher than the median. The payroll is only slightly skewed with a difference of only six million dollars between the mean and median.

(d)

The mean team value is more than two billion dollars. There is a large amount of variation in the team value around the mean with a coefficient of variation of 54.97%. The kurtosis statistic for team value of 3.8 indicates departure from normality. The coefficient of variation for payroll is 42.25%. There is some negative kurtosis in payroll indicating some lack of normality.

3.13

(a), (b)

Mean Median Mode Minimum Maximum Range Variance Standard Deviation Coeff. of Variation Skewness Kurtosis Count Standard Error

Number of Partners 19.51111111 18 14 5 41 36 59.3919 7.7066 39.50% 0.8041 0.5737 45 1.1488

There are no Z-Scores greater than 3 or less than –3, which indicates there are no outliers.



3.13 cont.

(a), (b) Number of Partners 37 41 26 14 22 29 36 11 16 29 30 20 20 20 26 21 17 21 14 28 24 14 15 19 14 11 18 9 10 13 14 24 25 13 5 13 20 15 17 16 26 18 20 16 11

Z score 2.269335 2.788369 0.84199 –0.71511 0.322955 1.231265 2.139576 –1.10439 –0.4556 1.231265 1.361024 0.063438 0.063438 0.063438 0.84199 0.193196 –0.32584 0.193196 –0.71511 1.101507 0.582472 –0.71511 –0.58536 –0.06632 –0.71511 –1.10439 –0.19608 –1.36391 –1.23415 –0.84487 –0.71511 0.582472 0.712231 –0.84487 –1.88294 –0.84487 0.063438 –0.58536 –0.32584 –0.4556 0.84199 –0.19608 0.063438 –0.4556 –1.10439



3.13 cont.

(c) (d)

Because the mean is larger than the median, the data is skewed to the right. The mean number of partners in rising accounting firms is 19.5 and half of the rising accounting firms have 18 or more partners. The average scatter around the mean is 7.71 and the lowest number of partners is 5 (Krost) and the greatest number of partners is 41 (Brady, Martz & Associates).

3.14

(a), (b)

Statistic             Mobile Commerce Penetration
Mean                  59.30714286
Median                63.95
Mode                  #N/A
Minimum               32.1
Maximum               79.1
Range                 47
Variance              199.9299
Standard Deviation    14.1397
Coeff. of Variation   23.84%
Skewness              –0.8638
Kurtosis              –0.1033
Count                 14
Standard Error        3.7790

Country        Mobile Commerce Penetration   Z score
Australia      36.4                          –1.6197
China          64.3                          0.3535
India          57.3                          –0.1416
Indonesia      79.1                          1.4002
Japan          32.1                          –1.9238
Malaysia       68.6                          0.6576
New Zealand    38.9                          –1.4429
Philippines    69.6                          0.7283
Saudi Arabia   65.1                          0.4101
Singapore      56.9                          –0.1699
South Korea    59.9                          0.0423
Taiwan         63.8                          0.3181
Thailand       74.2                          1.0537
Vietnam        64.1                          0.3393

There are no Z-Scores greater than 3 or less than –3, which indicates there are no outliers.

(c) (d)

Because the mean is smaller than the median, the data is skewed to the left. The mean Mobile Commerce Penetration is 59.307% and half the countries have values greater than or equal to 63.95%. The average scatter around the mean is 14.1397%. The lowest value is 32.1% (Japan), and the highest value is 79.1% (Indonesia).
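The Z scores and the outlier check for 3.14 can be reproduced with a short sketch. It assumes the 14 penetration values listed in the solution, the sample standard deviation (n – 1 divisor), and the ±3 cutoff used in the text; the names `penetration` and `z_scores` are illustrative only.

```python
from statistics import mean, stdev

# Mobile commerce penetration (%) by country, copied from the 3.14 solution.
penetration = {
    "Australia": 36.4, "China": 64.3, "India": 57.3, "Indonesia": 79.1,
    "Japan": 32.1, "Malaysia": 68.6, "New Zealand": 38.9, "Philippines": 69.6,
    "Saudi Arabia": 65.1, "Singapore": 56.9, "South Korea": 59.9,
    "Taiwan": 63.8, "Thailand": 74.2, "Vietnam": 64.1,
}


def z_scores(values):
    """Return (x - sample mean) / sample standard deviation for each value."""
    x_bar = mean(values)
    s = stdev(values)  # sample standard deviation, n - 1 divisor
    return [(x - x_bar) / s for x in values]


z = dict(zip(penetration, z_scores(list(penetration.values()))))

# A value is flagged as an outlier when its Z score is below -3 or above +3.
outliers = [country for country, score in z.items() if abs(score) > 3]

for country, score in z.items():
    print(f"{country:<13}{score:8.4f}")
print("Outliers:", outliers)
```

The printed scores may differ from the table in the fourth decimal place because of rounding in the published output.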



3.15

(a)

Statistic             One-Year CD   Five-Year CD
Mean                  0.18          0.38
Median                0.12          0.30
Mode                  0.1           0.2
Minimum               0.02          0.02
Maximum               0.55          1.00
Range                 0.53          0.98
Variance              0.0246        0.0763
Standard Deviation    0.1570        0.2762
Coeff. of Variation   89.12%        72.62%
Skewness              1.4615        0.6486
Kurtosis              1.3764        –0.2609
Count                 36            36
Standard Error        0.0262        0.0460

(b)

In absolute terms, five-year CDs have more variation than one-year CDs: the standard deviation, variance, and range are all greater for five-year CDs. Relative to the mean, however, one-year CDs vary more, with a coefficient of variation of 89.12% compared with 72.62% for five-year CDs.

3.16

(a),(b)

(c)

The mean time is 232.78 seconds, and half the calls last greater than or equal to 228 seconds, so call duration is slightly right-skewed. The average scatter around the mean is 158.6866 seconds. The shortest call lasted 65 seconds, and the longest call lasted 1141 seconds.



3.17

Excel output:

Waiting Time
Mean                       4.286667
Median                     4.5
Mode                       #N/A
Standard Deviation         1.637985
Sample Variance            2.682995
Range                      6.08
Minimum                    0.38
Maximum                    6.46
Sum                        64.3
Count                      15
First Quartile             3.2
Third Quartile             5.55
Interquartile Range        2.35
Coefficient of Variation   38.2112%

(a) Mean = 4.287, Median = 4.5

(b) Variance = 2.683, Standard deviation = 1.638, Range = 6.08, Coefficient of variation = 38.21%
Z scores: –0.05, 0.77, –0.77, 0.51, 0.30, –1.19, –0.46, –0.66, 0.13, 1.11, –2.39, 0.51, 1.33, 1.16, –0.30
There are no outliers.

(c) Since the mean is less than the median, the distribution is left-skewed.

(d) The mean and median are both under 5 minutes and the distribution is left-skewed, meaning that there are more unusually low observations than there are high observations. But six of the 15 bank customers sampled (or 40%) had wait times in excess of 5 minutes. So, although the customer is more likely to be served in less than 5 minutes, the manager may have been overconfident in responding that the customer would "almost certainly" not wait longer than 5 minutes for service.

3.18

(a) Mean = 7.11, Median = 6.68

(b) Variance = 4.336, Standard Deviation = 2.082, Range = 6.67, Coefficient of variation = 29.27%

Waiting Time   Z Score
9.66           1.222431
5.90           –0.583360
8.02           0.434799
5.79           –0.636190
8.73           0.775786
3.82           –1.582310
8.01           0.429996
8.35           0.593286
10.49          1.621050
6.68           –0.208750
5.64           –0.708230
4.08           –1.457440
6.17           –0.453690
9.91           1.342497
5.47           –0.789870

Since there are no Z values below –3.0 or above 3.0, there are no outliers.

(c) Because the mean is greater than the median, the distribution is right-skewed.
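As a cross-check of the 3.18 summary measures, the following sketch recomputes them from the 15 waiting times listed above using only the Python standard library. The variable names are illustrative, and the variance and standard deviation use the sample (n – 1) formulas.

```python
from statistics import mean, median, stdev, variance

# Waiting times (minutes) for the 15 customers in the 3.18 solution.
waiting_times = [9.66, 5.90, 8.02, 5.79, 8.73, 3.82, 8.01, 8.35,
                 10.49, 6.68, 5.64, 4.08, 6.17, 9.91, 5.47]

x_bar = mean(waiting_times)
med = median(waiting_times)
s2 = variance(waiting_times)          # sample variance, n - 1 divisor
s = stdev(waiting_times)              # sample standard deviation
data_range = max(waiting_times) - min(waiting_times)
cv = s / x_bar * 100                  # coefficient of variation, in percent

# Number of sampled customers who waited more than five minutes.
over_five = sum(t > 5 for t in waiting_times)

print(f"Mean = {x_bar:.2f}, Median = {med:.2f}")
print(f"Variance = {s2:.3f}, Standard deviation = {s:.3f}")
print(f"Range = {data_range:.2f}, CV = {cv:.2f}%")
print(f"{over_five} of {len(waiting_times)} waits exceeded 5 minutes")
```

Because the mean exceeds the median, the sketch is consistent with the right-skewed shape reported in part (c).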



3.18 cont.

(d)

The mean and median are both greater than five minutes. The distribution is right-skewed, meaning that there are some unusually high values. Further, 13 of the 15 bank customers sampled (or 86.7%) had waiting times greater than five minutes. So the customer is likely to experience a waiting time in excess of five minutes. The manager overstated the bank’s service record in responding that the customer would "almost certainly" not wait longer than five minutes for service.

3.19

(a)

$\bar{R}_G = [(1 + (-0.0323))(1 + 0.0934)]^{1/2} - 1 = 2.86\%$

(b)

First year: 1,000(1 + 0.0286) = $1,028.60
Second year: 1,028.60(1 + 0.0286) = $1,058.02

(c)

The rate of return for Microsoft was much better than that of GE.

3.20

(a)

$\bar{R}_G = [(1 + 0.4253)(1 + 0.5504)]^{1/2} - 1 = 48.65\%$

(b)

First year: 1,000(1 + 0.4865) = $1,486.53
Second year: 1,486.53(1 + 0.4865) = $2,209.79

(c)

The rate of return for Microsoft was much better than that of GE.

3.21

(a)

Year             DJIA     S&P 500   NASDAQ
2018             -5.63    -6.24     -5.98
2019             22.24    28.88     37.96
2020             18.73    16.26     47.58
2021             7.25     26.89     26.63
Geometric mean   10.09%   15.55%    24.78%

Excel formula for DJIA: =((1+B2/100)*(1+B3/100)*(1+B4/100)*(1+B5/100))^(1/4)-1

(b) (c)

NASDAQ had the best rate of return followed by S&P 500. All of the indices had positive returns. The return on the metals was mostly lower than the rate of the stock indices.

3.22

(a)

Year             Platinum   Gold     Silver
2019             22.12      18.83    15.36
2020             10.44      24.60    47.44
2021             -10.44     -3.51    -11.55
Geometric mean   6.50%      12.63%   14.58%

Excel formula for Platinum: =((1+B2/100)*(1+B3/100)*(1+B4/100))^(1/3)-1

(b) (c)

Silver had the best rate of return followed by gold. The return on the metals was mostly lower than the rate of the stock indices.
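The geometric mean rates of return in 3.21 and 3.22 follow the same pattern as the Excel formulas shown above. A minimal Python sketch, assuming the annual percentage returns from the two tables (the function and dictionary names are illustrative, not from the text):

```python
import math


def geometric_mean_return(percent_returns):
    """Geometric mean rate of return for a list of annual returns in percent."""
    growth = math.prod(1 + r / 100 for r in percent_returns)
    return growth ** (1 / len(percent_returns)) - 1


# Annual returns (%) copied from the 3.21 and 3.22 tables.
annual_returns = {
    "DJIA": [-5.63, 22.24, 18.73, 7.25],
    "S&P 500": [-6.24, 28.88, 16.26, 26.89],
    "NASDAQ": [-5.98, 37.96, 47.58, 26.63],
    "Platinum": [22.12, 10.44, -10.44],
    "Gold": [18.83, 24.60, -3.51],
    "Silver": [15.36, 47.44, -11.55],
}

geo = {name: geometric_mean_return(r) for name, r in annual_returns.items()}

for name, rate in geo.items():
    print(f"{name:<9}{rate:8.2%}")
```

The same function applies to the two-year returns in 3.19 and 3.20, for example `geometric_mean_return([42.53, 55.04])`.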



3.23

(a) Mean of 3YrReturn%

                Risk Level
Type            Low     Average   High    Grand Total
Growth          17.66   16.91     15.64   16.76
  Small         14.42   15.16     14.11   14.47
  Mid-Cap       15.58   14.85     15.14   15.12
  Large         19.05   18.48     17.55   18.48
Value           12.69   12.90     12.03   12.57
  Small         11.33   12.15     11.09   11.44
  Mid-Cap       11.42   13.40     11.92   12.31
  Large         13.34   12.91     12.60   12.97
Grand Total     15.48   15.17     14.01   14.91

(b) StdDev of 3YrReturn%

                Risk Level
Type            Low     Average   High    Grand Total
Growth          5.36    5.13      4.42    5.05
  Small         5.82    5.52      4.03    4.76
  Mid-Cap       5.44    2.59      3.71    3.82
  Large         4.75    5.50      4.50    5.05
Value           3.10    2.84      2.66    2.88
  Small         2.88    2.65      3.23    2.98
  Mid-Cap       2.47    2.87      2.14    2.60
  Large         3.13    2.86      2.40    2.84
Grand Total     5.14    4.72      4.13    4.71

(c)

The mean three-year return is higher for growth funds than for value funds. For growth funds, the mean three-year rate of return is higher for large cap funds than for mid-cap and small cap funds. For value funds, the mean three-year rate of return is higher for large cap funds than for mid-cap and small cap funds. The mean three-year return is highest for low risk funds and is lower for average and high risk level funds. This occurs for both growth and value funds. The standard deviation of the three-year return varies much more for growth funds than for value funds. For the growth funds, the standard deviation is higher for large and small funds than mid-cap funds. The standard deviation for value funds does not vary much among different sized funds.

3.24

(a) Mean of 3YrReturn% Type Growth Small Mid-Cap Large Value Small Mid-Cap Large Grand Total

One 8.94 7.86 9.41 8.94 8.71 6.06 9.15 9.55 8.84

Two 14.29 10.79 13.03 16.10 11.24 10.37 11.33 11.40 12.92

Rating Three 16.93 14.35 15.69 18.82 12.27 11.62 12.04 12.52 14.95

Four 18.32 15.10 16.69 20.18 13.94 12.54 14.23 14.38 16.41


Five 24.90 22.75 20.92 27.45 16.63 14.16 16.10 17.26 20.83

Grand Total 16.76 14.47 15.12 18.48 12.57 11.44 12.31 12.97 14.92


3.24 cont.

(b) StdDev of 3Yr Return% Type Growth Small Mid-Cap Large Value Small Mid-Cap Large Grand Total

(c)

One 6.17 0.87 2.47 8.13 2.24 2.38 1.37 1.80 4.76

Two 3.39 6.66 2.80 2.10 2.93 3.34 2.74 2.92 3.53

Rating Three 2.92 3.02 2.08 1.51 1.83 2.47 1.32 1.69 3.41

Four 2.70 1.99 1.69 1.78 1.72 1.39 2.09 1.46 3.18

Five 7.95 4.38 10.23 8.55 2.46 2.64 2.08 2.20 7.20

Grand Total 5.05 4.76 3.82 5.05 2.88 2.98 2.60 2.84 4.71

The mean three-year return is higher for growth funds than for value funds. For growth funds, the mean three-year rate of return is higher for large cap funds than for mid-cap and small cap funds. For value funds, the mean three-year return is similar for all sized funds. The mean three-year return is highest for four and five star rated funds and is lower for one and two star rated funds. This occurs for both growth and value funds. The standard deviation of the three-year return varies much more for growth funds than for value funds. For the growth funds, the standard deviation is higher for small and large cap funds than mid-cap funds. The standard deviation for value funds does not vary much among different sized funds.

3.25

(a) Mean of 3YrReturn% Type Small Low Average High Mid-Cap Low Average High Large Low Average High Grand Total

One 6.83 4.00 8.49 6.71 9.30 11.46 8.49 8.62 9.21 10.97 8.77 7.69 8.84

Two 10.63 8.85 10.73 11.08 12.35 10.43 12.89 13.35 13.80 14.61 12.98 14.08 12.92

Rating Three 13.44 12.82 13.90 13.43 14.32 14.39 14.67 13.29 15.78 15.80 15.69 16.00 14.95

Four 13.68 13.56 14.97 13.21 15.97 16.43 15.86 15.81 17.48 17.38 18.10 16.39 16.41


Five 20.07 20.90 20.39 19.44 18.78 39.17 1.83 16.15 21.59 23.22 20.98 18.73 20.83

Grand Total 13.28 12.82 14.04 13.04 14.09 14.10 14.42 13.56 15.79 16.49 15.70 15.00 14.91


3.25 cont.

(b) StdDev of 3YrReturn% Type Small Low Average High Mid-Cap Low Average High Large Low Average High Grand Total

One 2.01 -- 1.43 1.69 2.01 1.51 1.76 1.73 6.12 1.81 7.31 7.68 4.76

Two 3.49 2.52 1.57 4.30 2.87 2.73 2.04 3.20 3.45 3.90 3.04 3.42 3.53

Rating Three 3.11 1.94 4.27 2.73 2.55 2.49 2.49 2.77 3.53 3.45 3.61 3.61 3.41

Four 2.10 2.00 2.33 1.97 2.12 2.05 1.97 2.63 3.33 3.01 3.39 3.53 3.18

Five 5.62 9.35 6.12 4.19 7.77 -- 2.41 0.91 7.67 7.95 7.99 5.07 7.20

Grand Total 4.40 4.72 4.86 4.01 3.67 5.0 2.74 3.42 4.96 4.99 5.19 4.34 4.71

(c)

The mean three-year return for large-cap funds is much higher than mid-cap or small-cap funds. In all but one size-and-risk category (mid-cap, average-risk funds in the table of means), five-star funds have the highest mean three-year return. Large-cap, five-star, low-risk funds have the highest mean three-year return and the small-cap one-star, average-risk funds have the lowest standard deviation. The highest standard deviation is found in small-cap, low-risk, five-star funds.

3.26

(a) Mean of 3Yr Return%

                Rating
Type            One     Two     Three   Four    Five    Grand Total
Growth          8.94    14.29   16.93   18.32   24.90   16.76
  Low           11.76   14.51   17.34   18.38   27.95   17.66
  Average       8.21    13.81   17.19   18.55   24.95   16.91
  High          6.28    14.56   16.04   17.71   20.71   15.64
Value           8.71    11.24   12.27   13.94   16.63   12.57
  Low           7.03    10.98   12.42   14.24   17.25   12.69
  Average       9.30    11.48   12.36   14.45   16.91   12.90
  High          8.81    11.15   11.90   13.24   15.47   12.03
Grand Total     8.84    12.92   14.95   16.41   20.83   14.91

(b) StdDev of 3Yr Return%

                Rating
Type            One     Two     Three   Four    Five    Grand Total
Growth          6.17    3.39    2.92    2.70    7.95    5.05
  Low           1.06    3.81    2.40    2.40    8.53    5.36
  Average       7.99    2.41    3.19    2.71    8.80    5.13
  High          6.45    3.82    2.84    3.08    3.59    4.42
Value           2.24    2.93    1.83    1.72    2.46    2.88
  Low           2.65    3.76    1.73    1.77    2.40    3.10
  Average       2.15    2.60    1.63    1.86    2.69    2.84
  High          2.15    2.69    2.27    1.34    1.85    2.66
Grand Total     4.76    3.53    3.41    3.18    7.20    4.71

(c)

The mean three-year return is higher for growth funds than for value funds. For both growth and value funds, the mean three-year return does not differ much depending on the risk level of the funds. The mean three-year return is highest for four and five star rated funds and is lower for one and two star rated funds. This occurs for both the growth and value funds.

The standard deviation of the three-year return varies much more for growth funds than for value funds. For the growth funds, the standard deviation does not vary much among different sized funds.

3.27

(a) (b) (c)

Q1 = 3, Q3 = 9, interquartile range = 6
Five-number summary: 0 3 7 9 12
[Box-and-whisker plot of X]

(d)

The distribution is left-skewed. Since one of the data points is different, 12 here in 3.27 and 11 in 3.3, the answers are not the same. The maximum for 3.27 is 12 and the maximum in 3.3 is 11. The rest of the five-number summary is the same.

3.28

(a) (b) (c)

Q1 = 4, Q3 = 9, interquartile range = 5
Five-number summary: 3 4 7 9 12
[Box-and-whisker plot]

(d)

The distances between the median and the extremes are close, 4 and 5, but the differences in the tails are different (1 on the left and 3 on the right), so this distribution is slightly right-skewed. In 3.2 (d), because the mean and median are equal, the distribution is symmetric. The box part of the graph is symmetric, but the tails show right-skewness.

3.29

(a) (b) (c)

Q1 = 3, Q3 = 8.5, interquartile range = 5.5
Five-number summary: 2 3 7 8.5 9
[Box-and-whisker plot]

(d)

The distribution is left-skewed. Answers are the same.

3.30

(a) (b)

Q1 = –6.5, Q3 = 8, interquartile range = 14.5
Five-number summary: –8 –6.5 7 8 9

(c)

[Box-and-whisker plot of X]

(d)

The distribution is left-skewed. This is consistent with the answer in 3.4 (d).

3.31

(a) (b)

Q1 = 14, Q3 = 24.5, interquartile range = 10.5

Five-Number Summary

Minimum First Quartile Median Third Quartile Maximum

5 14 18 24.5 41

(c)

The distribution is right-skewed



3.32

(a) (b)

Q1 = 56.9, Q3 = 68.6, interquartile range = 11.7 Five-Number Summary

Minimum First Quartile Median Third Quartile Maximum

32.1 56.9 63.95 68.6 79.1

(c)

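The distribution is left-skewed: the distance from the minimum to the median (31.85) is much larger than the distance from the median to the maximum (15.15), which agrees with the negative skewness reported in 3.14.

3.33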

(a) (b)

Q1 = 139, Q3 = 273, interquartile range = 134



3.33 cont.

(c)

The distribution is right-skewed.

3.34

(a), (b)

Five-Number Summary   Before Halftime   Halftime or After
Minimum               4.5               4
First Quartile        5.3               5.05
Median                5.8               5.6
Third Quartile        6.4               5.95
Maximum               7.4               7.3

Interquartile Range   1.1               0.9



3.34 cont.

(c)

The boxplot for halftime or after is approximately symmetrical and the boxplot for the first or second quarter is also approximately symmetric.

3.35

(a), (b)

Five-Number Summary   One-Year CD   Five-Year CD
Minimum               0.02          0.02
First Quartile        0.05          0.15
Median                0.115         0.3
Third Quartile        0.25          0.55
Maximum               0.55          1
Interquartile Range   0.2           0.4



3.35 cont.

(c)

Both of the distributions are right-skewed.

3.36

(a) (b)

The waiting time for bank 1 in the commercial district of a city is skewed to the left. The waiting time for bank 2 in the residential area is skewed right.

(c)

The central tendency of the waiting times for the bank branch located in the commercial district (bank 1) of a city is lower than that of the branch located in the residential area (bank 2). There are a few longer than normal waiting times for the branch located in the residential area whereas there are a few exceptionally short waiting times for the branch located in the commercial area.

3.37

(a) Population mean, $\mu = 6$
(b) $\sigma^2 = 9.4$, $\sigma = 3.07$

3.38

(a) Population mean, $\mu = 6$
(b) $\sigma^2 = 2.8$, $\sigma = 1.67$

3.39

(a)

Population mean = 4.362, Population variance = 0.5944, Population standard deviation = 0.7709

(b)

Within one standard deviation of the mean is (3.59, 5.13) counting the data reveals 31 states or 31/50*100 = 62% of the states are within this range. Within two standard deviations of the mean is (2.82, 5.90), counting the data reveals 50 states or 50/50*100 = 100% of the states are within this range. Within three standard deviations of the mean is (2.05, 6.67), counting the data reveals 50 states or 100% of the states are within this range.

(c)

This is slightly different from 68%, 95% and 99.7% of the empirical rule.

3.40

(a) 68%
(b) 95%
(c) At least 0%, 75%, and 88.89%
(d) $\mu - 4\sigma$ to $\mu + 4\sigma$, or –2.8 to 19.2

3.41

(a)

Population mean = 1.9226 Population standard deviation = 1.1886

(b)

On average, the cigarette tax is $1.92. The typical distance between the cigarette tax in each of the 50 states and the District of Columbia and the population mean cigarette tax is $1.19.

3.42

(a)

Population mean = 14.4267 Population variance = 19.022 Population standard deviation = 4.3614

(b)

Within one standard deviation of the mean is (10.065, 18.788); counting the data reveals 42 states or 42/51*100 = 82.35% of the states and DC are within this range. Within two standard deviations of the mean is (5.704, 23.150); counting the data reveals 49 states or 49/51*100 = 96.08% of the states and DC are within this range. Within three standard deviations of the mean is (1.342, 27.511); counting the data reveals 50 states or 50/51*100 = 98.04% of the states and DC are within this range.

(c)

This is slightly different from 68%, 95% and 99.7% of the empirical rule.

3.43

(a)

Population mean = 397.1617 Population standard deviation = 638.721

(b)

On average, the market capitalization for this population of 30 companies is $397.2 billion. The typical distance between the market capitalization and the mean market capitalization for this population of 30 companies is $638.7 billion.

3.44

(a)

cov(X, Y) = 65.2909

(b)

$S_X^2 = 21.7636$, $S_Y^2 = 195.8727$

$r = \frac{\mathrm{cov}(X, Y)}{\sqrt{S_X^2}\sqrt{S_Y^2}} = \frac{65.2909}{\sqrt{21.7636}\sqrt{195.8727}} = 1.0$

(c)

There is a perfect positive linear relationship between X and Y; all the points lie exactly on a straight line with a positive slope.

3.45

(a)

The study suggests that the perceived usefulness of smartphones in an educational setting and the number of times students used their smartphone to send or read email for class purpose are positively correlated.

(b)

There could be a cause and effect relationship between perceived usefulness of smartphones and the number of times students used their smartphone to send or read email for class purposes. The more a student uses their smartphone for class the more they may feel it is useful in an educational setting.

3.46

(a)

cov(X, Y) = 133.3333

(b)

$S_X^2 = 2200$, $S_Y^2 = 11.4762$

$r = \frac{\mathrm{cov}(X, Y)}{S_X S_Y} = 0.8391$

(c)

The correlation coefficient is more valuable for expressing the relationship between calories and sugar because it does not depend on the units used to measure calories and sugar.

(d)

There is a strong positive linear relationship between calories and sugar.

3.47

(a)

Covariance sample First Weekend & US Gross = 1378.794 Covariance sample First Weekend & Worldwide Gross = 4781.008 Covariance sample US Gross & Worldwide Gross = 9937.710

(b)

Coefficient of correlation First Weekend & US Gross = 0.7600 Coefficient of correlation First Weekend & Worldwide Gross = 0.8596 Coefficient of correlation US Gross &Worldwide Gross = 0.9448

(c)

The correlation coefficient is more valuable for expressing the relationship because it does not depend on the units used.

(d)

There is a strong positive linear relationship between U.S. gross and worldwide gross, first weekend gross and worldwide gross and first weekend gross and U.S. gross.



3.48

Excel Output:

City            Download Speed   Upload Speed
Austin          726.3            80.9
Baltimore       740.2            89.0
Charlotte       1171.3           110.6
New Orleans     696.9            97.2
Oklahoma City   727.5            111.6
San Diego       675.4            131.7
San Francisco   1008.9           121.4
Tampa           817.6            105.0
Covariance      726.4682143      =COVARIANCE.S(B2:B9, C2:C9)
r               0.246308328      =CORREL(B2:B9, C2:C9)

(a) cov(X, Y) = 726.5
(b) Correlation = r = 0.2463
(c) There is a weak positive linear relationship between download speed and upload speed.

3.49

(a)

Covariance sample between franchise value and payroll = 47.98 Coefficient of correlation between franchise value and payroll = 0.6766

(b)

Covariance of sample between payroll and number of wins = 396.57

(c)

Coefficient of correlation between payroll and number of wins = 0.4406

(d)

The covariance between payroll and number of wins is 396.57 which is much higher than the covariance between franchise value and payroll, which is only 47.98. There is more of a positive linear relationship between franchise value and payroll, with correlation value 0.6766. The positive linear relationship is not very strong between payroll and number of wins, with correlation value 0.4406.

3.50

We should look for ways to describe the typical value, the variation, and the distribution of the data within a range.

3.51

Central tendency or location refers to the fact that most sets of data show a distinct tendency to group or cluster about a certain central point.

3.52

The arithmetic mean is a simple average of all the values, but is subject to the effect of extreme values. The median is the middle ranked value, but varies more from sample to sample than the arithmetic mean, although it is less susceptible to extreme values. The mode is the most common value, but is extremely variable from sample to sample.

3.53

The first quartile is the value below which 25% of the total ranked observations will fall, the median is the value that divides the total ranked observations into two equal halves and the third quartile is the observation above which 25% of the total ranked observations will fall.

3.54

Variation is the amount of dispersion, or "spread," in the data.

3.55

The Z score measures how many standard deviations an observation in a data set is away from the mean.

3.56

The range is a simple measure, but only measures the difference between the extremes. The interquartile range measures the range of the center fifty percent of the data. The standard deviation measures variation around the mean while the variance measures the squared variation around the mean, and these are the only measures that take into account each observation. The coefficient of variation measures the variation around the mean relative to the mean. The range, standard deviation, variance and coefficient of variation are all sensitive to outliers while the interquartile range is not.

3.57

The empirical rule relates the mean and standard deviation to the percentage of values that will fall within a certain number of standard deviations of the mean.

3.58

Chebyshev’s theorem applies to any type of distribution while the empirical rule applies only to data sets that are approximately bell-shaped. The empirical rule is more accurate than the Chebyshev rule in approximating the concentration of data around the mean.
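The difference between the two rules can be tabulated with a small sketch. The function name is illustrative; the empirical-rule percentages are the rounded 68%, 95%, and 99.7% figures used in this chapter, and the Chebyshev bound is $1 - 1/k^2$.

```python
# Approximate shares under the empirical rule (bell-shaped data only).
EMPIRICAL_RULE = {1: 0.68, 2: 0.95, 3: 0.997}


def chebyshev_lower_bound(k):
    """Minimum share of values within k standard deviations of the mean.

    Holds for any distribution; at k = 1 the bound is 0, so it carries
    no information there.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    return 1 - 1 / k ** 2


for k in (1, 2, 3):
    print(f"within {k} sd: Chebyshev at least {chebyshev_lower_bound(k):.2%}, "
          f"empirical rule about {EMPIRICAL_RULE[k]:.1%}")
```

The Chebyshev figures (0%, 75%, 88.89%) are guaranteed minimums, which is why they sit below the empirical-rule approximations.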

3.59

Shape is the manner in which the data are distributed. The shape of a data set can be symmetrical or asymmetrical (skewed).
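One way to quantify the asymmetry described above is the sample skewness statistic. The sketch below uses the adjusted formula $\frac{n}{(n-1)(n-2)}\sum\left(\frac{X_i - \bar{X}}{S}\right)^3$, which is assumed here to be the one behind the Skewness rows in the Excel output; the data are the 14 mobile commerce penetration values from 3.14, and the function name is illustrative.

```python
from statistics import mean, stdev


def sample_skewness(values):
    """Adjusted sample skewness: n / ((n-1)(n-2)) * sum of cubed Z scores."""
    n = len(values)
    x_bar = mean(values)
    s = stdev(values)  # sample standard deviation, n - 1 divisor
    return n / ((n - 1) * (n - 2)) * sum(((x - x_bar) / s) ** 3 for x in values)


# Mobile commerce penetration (%) for the 14 countries in 3.14.
penetration = [36.4, 64.3, 57.3, 79.1, 32.1, 68.6, 38.9,
               69.6, 65.1, 56.9, 59.9, 63.8, 74.2, 64.1]

skew = sample_skewness(penetration)
shape = "left-skewed" if skew < 0 else "right-skewed" if skew > 0 else "symmetric"
print(f"Skewness = {skew:.4f} ({shape})")
```

A negative value indicates a longer left tail, a positive value a longer right tail, and a value near zero an approximately symmetrical distribution.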

3.60

Skewness measures the extent to which the data values are not symmetrical around the mean. Kurtosis measures the extent to which values that are very different from the mean affect the shape of the distribution of a set of data.

3.61

For symmetrical distributions, the boxplot is also symmetrical with the median splitting the box in half and whiskers of equal length. For left-skewed distributions, the boxplot’s left whisker will be longer and the median will be located in the right half of the box. For right-skewed distributions, the boxplot’s right whisker will be longer and the median will be located in the left half of the box.

3.62

The covariance measures the strength of the linear relationship between two numerical variables while the coefficient of correlation measures the relative strength of the linear relationship. The value of the covariance depends very much on the units used to measure the two numerical variables while the value of the coefficient of correlation is totally free from the units used.
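The unit dependence of the covariance, and the unit-free nature of the coefficient of correlation, can be illustrated with the download and upload speeds from 3.48. This is a minimal sketch with illustrative function names; rescaling the download speeds (an assumed conversion from Mbps to Gbps) changes the covariance but not r.

```python
from statistics import mean, stdev


def sample_covariance(x, y):
    """Sample covariance with an n - 1 divisor (as in Excel's COVARIANCE.S)."""
    x_bar, y_bar = mean(x), mean(y)
    return sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y)) / (len(x) - 1)


def correlation(x, y):
    """Coefficient of correlation: covariance divided by both standard deviations."""
    return sample_covariance(x, y) / (stdev(x) * stdev(y))


# Speeds for the eight cities in the 3.48 Excel output.
download = [726.3, 740.2, 1171.3, 696.9, 727.5, 675.4, 1008.9, 817.6]
upload = [80.9, 89.0, 110.6, 97.2, 111.6, 131.7, 121.4, 105.0]

cov_xy = sample_covariance(download, upload)
r = correlation(download, upload)

# Changing the units of one variable rescales the covariance but leaves r alone.
download_gbps = [d / 1000 for d in download]
cov_rescaled = sample_covariance(download_gbps, upload)
r_rescaled = correlation(download_gbps, upload)

print(f"cov = {cov_xy:.4f}, r = {r:.4f}")
print(f"after rescaling: cov = {cov_rescaled:.4f}, r = {r_rescaled:.4f}")
```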

3.63

The arithmetic mean is the most common measure of central tendency and is calculated by dividing the sum of the values in the data set by the number of data values in the set. The geometric mean is used to measure the rate of change of a variable over time. It is calculated by taking the nth root of the product of the n data values, where n is the number of data values.

3.64

The geometric mean is used to measure the rate of change of a variable over time. It is calculated by taking the nth root of the product of the n data values, where n is the number of data values. The geometric rate of return measures the mean percentage return of an investment per time period.



3.65

Excel Output

City Download Speeds
Mean                  797.77
Median                777.75
Mode                  #N/A
Minimum               672.72
Maximum               1916.28
Range                 1243.56
Variance              18038.1432
Standard Deviation    134.3062
Coeff. of Variation   16.84%
Skewness              5.9260
Kurtosis              48.7143
Count                 100
Standard Error        13.4306

City Download Speeds (Five-Number Summary)
Minimum               672.72
First Quartile        730.99
Median                777.75
Third Quartile        848.77
Maximum               1916.28
IQR                   177.78

(a) Download Speed: mean = 797.77, median = 777.75, first quartile = 730.99, third quartile = 848.77

(b) Download Speed: range = 1243.56, interquartile range = 177.78, variance = 18,038.14, standard deviation = 134.31, coefficient of variation = 16.84%

(c)



(d)

The mean download speed is 797.77 Mbps with 50% of the cities having a download speed less than 777.75 Mbps and 50% of the cities having download speed between 730.99 Mbps and 848.77 Mbps.



3.66

Minitab Output:

(a) Mean = 45.22, Median = 45, first quartile = 25, third quartile = 63

(b) Range = 83, Interquartile range = 38, Variance = 535.79, Standard Deviation = 23.15, Coefficient of variation = 51.19%

(c) (d)

The distribution is approximately symmetric. The mean approval process takes 45.22 days with 50% of the policies being approved in less than 45 days. 50% of the applications are approved between 25 and 63 days. About 25% of the applicants are approved in no more than 25 days.

3.67

Excel output:

Days
Mean                  43.04
Median                28.5
Mode                  5
Standard Deviation    41.92606
Sample Variance       1757.794
Range                 164
Minimum               1
Maximum               165
First Quartile        14
Third Quartile        54
Interquartile Range   40
CV                    97.41%

(a) Mean = 43.04, Median = 28.5, Q1 = 14, Q3 = 54

(b) Range = 164, Interquartile range = 40, Variance = 1,757.79, Standard deviation = 41.926, Coefficient of variation = 97.41%


3.67 cont.

(c)

[Box-and-whisker plot for Days to Resolve Complaints; horizontal axis: Days, 0 to 150]

(d)

The distribution is right-skewed. Half of all customer complaints that year were resolved in less than a month (median = 28.5 days), 75% of them within 54 days. There were five complaints that were particularly difficult to settle which brought the overall mean up to 43 days. No complaint took longer than 165 days to resolve.

3.68

(a)

Excel output:

Minitab Output:

(b)

Using the formulas in the text with n = 50, Q1 = (50 + 1)/4 = 12.75 ranked value, so choose the 13th ranked value, which is 12. Q3 = 3(50 + 1)/4 = 38.25 ranked value, so choose the 38th ranked value, which is 18. Therefore, the five-number summary (min, Q1, median, Q3, max) is 5, 12, 15, 18, 28. * Note: Minitab uses a slightly different formula to calculate the quartiles.
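The ranked-value rule applied above can be sketched in Python. The rounding rules are an assumption based on how this manual applies them: a whole-number position is used as is, a position ending in .5 averages the two neighboring ranked values, and any other position is rounded to the nearest rank. Because the 3.68 data are not reproduced here, the example uses the 29 Halftime or After ratings from 3.11; the function names are illustrative.

```python
import math


def ranked_value_position(n, quartile):
    """Position of the quartile in the ranked data: quartile * (n + 1) / 4."""
    return quartile * (n + 1) / 4


def textbook_quartile(values, quartile):
    """Quartile using the ranked-value rounding rules assumed above."""
    ordered = sorted(values)
    pos = ranked_value_position(len(ordered), quartile)
    whole = math.floor(pos)
    frac = pos - whole
    if frac == 0:                      # whole-number position: take that value
        return ordered[whole - 1]
    if frac == 0.5:                    # halfway: average the two neighbors
        return (ordered[whole - 1] + ordered[whole]) / 2
    nearest = whole + 1 if frac > 0.5 else whole
    return ordered[nearest - 1]        # otherwise round to the nearest rank


# Ratings for ads shown at halftime or after, from the 3.11 solution.
after_halftime = [4.0, 4.3, 4.7, 4.8, 4.8, 4.9, 4.9, 5.2, 5.2, 5.3,
                  5.3, 5.3, 5.4, 5.5, 5.6, 5.6, 5.7, 5.7, 5.8, 5.8,
                  5.9, 5.9, 6.0, 6.0, 6.1, 6.3, 6.5, 6.7, 7.3]

print("n = 50 positions:", ranked_value_position(50, 1), ranked_value_position(50, 3))
print("Q1 =", textbook_quartile(after_halftime, 1))
print("Median =", textbook_quartile(after_halftime, 2))
print("Q3 =", textbook_quartile(after_halftime, 3))
```

Statistical packages often interpolate between ranked values instead, which is why Minitab and other tools can report slightly different quartiles.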

(c) (d)

The distribution is symmetric. The service level is met because 75% of the calls are answered in less than 18 seconds.

3.69

(a) and (b) Excel Output

Statistic             Electricity   Gas        Water      Sewer      Cable     Internet
Mean                  110.255       82.412     38.902     55.294     46.490    37.157
Median                108           82         32         47         47        30
Mode                  95            87         26         33         47        30
Minimum               73            39         18         19         23        20
Maximum               160           138        91         122        55        70
Range                 87            99         73         103        32        50
Variance              321.9937      596.6871   317.4502   688.5718   21.5349   111.2549
Standard Deviation    17.9442       24.4272    17.8171    26.2407    4.6406    10.5477
Coeff. of Variation   16.28%        29.64%     45.80%     47.46%     9.98%     28.39%
Skewness              0.4685        0.5942     1.2992     0.6390     -2.6055   0.9179
Kurtosis              0.2788        -0.1853    0.7754     -0.3136    13.3315   -0.0055
Count                 51            51         51         51         51        51
Standard Error        2.5127        3.4205     2.4949     3.6744     0.6498    1.4770

Five-Number Summary   Electricity   Gas   Water   Sewer   Cable   Internet
Minimum               73            39    18      19      23      20
First Quartile        95            64    27      33      45      30
Median                108           82    32      47      47      30
Third Quartile        123           93    46      72      47      50
Maximum               160           138   91      122     55      70
IQR                   28            29    19      39      2       20



3.69 cont.

(c)

For Electricity, the data are slightly skewed right. For Gas, the data are slightly skewed left. For Water, the data are skewed right. For Sewer, the data are skewed right. For cable, the median and third quartile are the same, 47, and the data are skewed left. For Internet, the median and first quartile are the same, 30, and the data are skewed right.

(d)

Coefficient of correlation between Electricity and Cable = 0.0763
Coefficient of correlation between Electricity and Internet = 0.1122

3.70

(a), (b)

Statistic             Bundle Score   Typical Cost ($)
Mean                  54.775         24.175
Standard Error        4.367344951    2.866224064
Median                62             20
Mode                  75             8
Standard Deviation    27.62151475    18.12759265
Sample Variance       762.9480769    328.6096154
Kurtosis              -0.845357193   2.766393511
Skewness              -0.48041728    1.541239625
Range                 98             83
Minimum               2              5
Maximum               100            88
Sum                   2191           967
Count                 40             40
First Quartile        34             9
Third Quartile        75             31
Interquartile Range   41             22
CV                    50.43%         74.98%

3.70 cont.

(c) Boxplot

[Boxplots of Typical Cost ($) and Bundle Score; horizontal axis from 0 to 100]

The typical cost is right-skewed, while the bundle score is left-skewed.

(d) r = cov(X, Y) / (S_X S_Y) = 0.3465

(e) The mean typical cost is $24.18, with an average spread around the mean equaling $18.13. The spread between the lowest and highest costs is $83. The middle 50% of the typical costs fall over a range of $22, from $9 to $31, while half of the typical costs are below $20. The mean bundle score is 54.775, with an average spread around the mean equaling 27.6215. The spread between the lowest and highest scores is 98. The middle 50% of the scores fall over a range of 41, from 34 to 75, while half of the scores are below 62. The typical cost is right-skewed, while the bundle score is left-skewed. There is a weak positive linear relationship between typical cost and bundle score.
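The coefficient of correlation in (d) is the sample covariance divided by the product of the two sample standard deviations. The sketch below shows that calculation in plain Python; the function name `sample_correlation` is hypothetical, and because the bundle data are not reproduced in this solution, it is only demonstrated on made-up values.

```python
import math

def sample_correlation(x, y):
    """r = cov(X, Y) / (S_X * S_Y), using n - 1 in every denominator."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)
    s_x = math.sqrt(sum((a - mean_x) ** 2 for a in x) / (n - 1))
    s_y = math.sqrt(sum((b - mean_y) ** 2 for b in y) / (n - 1))
    return cov / (s_x * s_y)

# Illustrative values only (not the bundle data from the problem).
r_up = sample_correlation([1, 2, 3, 4], [2, 4, 6, 8])    # perfect positive
r_down = sample_correlation([1, 2, 3, 4], [8, 6, 4, 2])  # perfect negative
```

Values of r near 0, such as the 0.3465 reported above, indicate a weak linear relationship.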

3.71

Excel output:

                     Teabags
Mean                 5.5014
Standard Error       0.014967
Median               5.515
Mode                 5.53
Standard Deviation   0.10583
Sample Variance      0.0112
Kurtosis             0.127022
Skewness             -0.15249
Range                0.52
Minimum              5.25
Maximum              5.77
Sum                  275.07
Count                50
First Quartile       5.44
Third Quartile       5.57
Interquartile Range  0.13
CV                   1.9237%

(a) mean = 5.5014, median = 5.515, first quartile = 5.44, third quartile = 5.57

(b) Range = 0.52, Interquartile range = 0.13, Variance = 0.0112, Standard Deviation = 0.10583, Coefficient of Variation = 1.924%

(c) The mean weight of the tea bags in the sample is 5.5014 grams while the middle ranked weight is 5.515. The company should be concerned about the central tendency because that is where the majority of the weight will cluster around. The average of the squared differences between the weights in the sample and the sample mean is 0.0112 whereas the square-root of it is 0.106 gram. The difference between the lightest and the heaviest tea bags in the sample is 0.52. 50% of the tea bags in the sample weigh between 5.44 and 5.57 grams. According to the empirical rule, about 68% of the tea bags produced will have weight that falls within 0.106 grams around 5.5014 grams. The company producing the tea bags should be concerned about the variation because tea bags will not weigh exactly the same due to various factors in the production process, e.g. temperature and humidity inside the factory, differences in the density of the tea, etc. Having some idea about the amount of variation will enable the company to adjust the production process accordingly.

(d) Box-and-whisker Plot

[Box-and-whisker plot of Teabags; horizontal axis from 5 to 6]

The data are slightly left skewed.

(e) On average, the weight of the teabags is quite close to the target of 5.5 grams. Even though the mean weight is close to the target weight of 5.5 grams, the standard deviation of 0.106 indicates that about 75% of the teabags will fall within 0.212 grams around the target weight of 5.5 grams. The interquartile range of 0.13 also indicates that half of the teabags in the sample fall in an interval 0.13 grams around the median weight of 5.515 grams. The process can be adjusted to reduce the variation of the weight around the target mean.

3.72

(a) Excel output:

Five-number Summary
                  Boston   Vermont
Minimum           0.04     0.02
First Quartile    0.17     0.13
Median            0.23     0.2
Third Quartile    0.32     0.28
Maximum           0.98     0.83

3.72 cont.

(b) Box-and-whisker Plot

[Box-and-whisker plots of Vermont and Boston; horizontal axis from 0 to 1]

Both distributions are right skewed.

(c) Both sets of shingles did quite well in achieving a granule loss of 0.8 gram or less. The Boston shingles had only two data points greater than 0.8 gram. The next highest to these was 0.6 gram. These two data points can be considered outliers. Only 1.176% of the shingles failed the specification. In the Vermont shingles, only one data point was greater than 0.8 gram. The next highest was 0.58 gram. Thus, only 0.714% of the shingles failed to meet the specification.

3.73

(a) Five-Number Summary

                  Center City   Metro Area
Minimum           13            14
First Quartile    41            35
Median            64            48
Third Quartile    75            59
Maximum           109           70

3.73 cont.

(b) The Metro Area is slightly left-skewed. The Center City is left-skewed.

(c) Correlation coefficient of summated rating and cost of a meal for Center City = 0.6842
Correlation coefficient of summated rating and cost of a meal for Metro = 0.6867
There is a positive correlation between cost of a meal and summated rating. The higher priced restaurants tend to receive higher ratings than the lower priced restaurants.

(d) The median cost of a meal in the center city is $64 while the median cost of a meal in the metro area is $48. The range in costs of meals in the center city is greater than the range in costs of meals in the metro area.

3.74

(a), (b), (c)

              Calories   Protein    Cholesterol
Calories      1
Protein       0.464411   1
Cholesterol   0.177665   0.141673   1

(d) There is a rather weak positive linear relationship between calories and protein with a correlation coefficient of 0.46. The positive linear relationship between calories and cholesterol is quite weak at 0.178.


3.75

(a), (b), (d) Excel output:

Five-Number Summary (by major-conference membership)
                     No            Yes
Minimum              430           713.6
First Quartile       700           2941
Median               800           4000
Third Quartile       1500          5000
Maximum              3400          9013
IQR                  800           2059

                     No            Yes
Mean                 1102.36       4000.309231
Median               800           4000
Mode                 800           4000
Minimum              430           713.6
Maximum              3400          9013
Range                2970          8299.4
Variance             429003.5004   3476664.9999
Standard Deviation   654.9836      1864.5817
Coeff. of Variation  59.42%        46.61%
Skewness             1.6263        0.5675
Kurtosis             2.5864        0.7561
Count                50            65
Standard Error       92.6287       231.2729

The pay of coaches not in a major conference is right-skewed. The pay of coaches in a major conference is nearly symmetrical.

(c) The mean pay of coaches not in a major conference is $1102.36 million with standard deviation $654.98 million, whereas the mean pay of coaches in a major conference is $4000.31 million with standard deviation of $1864.58 million. The middle rank pay of coaches not in a major conference is $800 million. The middle rank pay of coaches in a major conference is $4000 million. The difference between the highest and lowest pay not in a major conference is $2970 million. The difference between the highest and lowest salary in a major conference is $8299.40 million.

(e) On average, major conference coaches are paid more than non-major conference coaches. There is much more variation in pay among major conference coaches compared to non-major conference coaches.

3.76

(a), (b) Excel output:

Five-Number Summary
                     Real Estate Tax Rate   Average Home Price   Annual Property Tax
Minimum              0.0028                 119000               606
First Quartile       0.0066                 157200               1446
Median               0.0092                 194500               2006
Third Quartile       0.0156                 273100               3390
Maximum              0.0249                 615300               5419
IQR                  0.009                  115900               1944

                     Real Estate Tax Rate   Average Home Price   Annual Property Tax
Mean                 0.011060784            233176.4706          2405.352941
Median               0.0092                 194500               2006
Mode                 0.0057                 172500               #N/A
Minimum              0.0028                 119000               606
Maximum              0.0249                 615300               5419
Range                0.0221                 496300               4813
Variance             0.0000                 11925884235.2941     1379255.4729
Standard Deviation   0.0054                 109205.6969          1174.4171
Coeff. of Variation  48.81%                 46.83%               48.83%
Skewness             0.8005                 1.9417               0.8025
Kurtosis             -0.2385                4.3227               -0.2323
Count                51                     51                   51
Standard Error       0.0008                 15291.8562           164.4513

(c) All three variables are highly right-skewed, especially average property value.

(d) Correlation coefficient between property taxes and home price = –0.1240.

(e) There is a large variation in each of the variables from state to state.


3.77

(a), (b) Excel output:

Five-Number Summary
                     Mobile Connection Speed   Broadband Connection Speed
Minimum              6.2                       35.32
First Quartile       13.7                      53.73
Median               21.8                      92.63
Third Quartile       31.3                      153.88
Maximum              59.6                      245.5
IQR                  17.6                      100.15

                     Mobile Connection Speed   Broadband Connection Speed
Mean                 23.98450704               108.3716901
Median               21.8                      92.63
Mode                 6.8                       77.88
Minimum              6.2                       35.32
Maximum              59.6                      245.5
Range                53.4                      210.18
Variance             165.2156                  3561.3746
Standard Deviation   12.8536                   59.6773
Coeff. of Variation  53.59%                    55.07%
Skewness             0.8854                    0.6167
Kurtosis             0.4848                    -0.7173
Count                71                        71
Standard Error       1.5254                    7.0824

(c) Both Mobile Connection Speed and Broadband Connection Speed are right-skewed.

(d) Correlation coefficient between mobile and broadband = 0.5513.

(e) The average mobile connection speed for the various countries surveyed is 23.98 Mbps. Half of the countries surveyed had mobile connection speed less than 21.8 Mbps. One-quarter of the countries surveyed had mobile connection speed less than 13.7 Mbps while another quarter had mobile connection speed greater than 31.3 Mbps. The range for mobile connection speed is 53.4 Mbps with standard deviation of 12.85 Mbps. The average broadband connection speed for the various countries surveyed is 108 Mbps. Half of the countries surveyed had broadband connection speed less than 92.6 Mbps. One-quarter of the countries surveyed had broadband connection speed less than 53.7 Mbps, while another quarter had broadband connection speed greater than 153.9 Mbps. The range for broadband connection speed is 210.2 Mbps with standard deviation of 59.7 Mbps.

(f) There is a positive linear relationship between mobile connection speed and broadband connection speed.

3.78

(a), (b)

                     Abandonment rate in % (7:00AM-3:00PM)
Mean                 13.86363636
Standard Error       1.625414306
Median               10
Mode                 9
Standard Deviation   7.623868875
Sample Variance      58.12337662
Kurtosis             0.723568739
Skewness             1.180708144
Range                29
Minimum              5
Maximum              34
Sum                  305
Count                22
First Quartile       9
Third Quartile       20
Interquartile Range  11
CV                   54.99%

(c) Boxplot

[Boxplot of Abandonment rate in % (7:00AM-3:00PM); horizontal axis from 0 to 35]

The data are right-skewed.

(d) r = 0.7575

(e) The average abandonment rate is 13.86%. Half of the abandonment rates are less than 10%. One-quarter of the abandonment rates are less than 9% while another one-quarter are more than 20%. The overall spread of the abandonment rates is 29%. The middle 50% of the abandonment rates are spread over 11%. The average spread of abandonment rates around the mean is 7.62%. The abandonment rates are right-skewed.

3.79

(a), (b) Excel Output

Five-Number Summary
                     Average Commuting Time
Minimum              19.5
First Quartile       23
Median               24.75
Third Quartile       27.1
Maximum              36.3
IQR                  4.1

                     Average Commuting Time
Mean                 25.385
Median               24.75
Mode                 23.4
Minimum              19.5
Maximum              36.3
Range                16.8
Variance             11.0869
Standard Deviation   3.3297
Coeff. of Variation  13.12%
Skewness             0.8592
Kurtosis             0.5924
Count                100
Standard Error       0.3330

(c) The data are skewed right.

(d) The average weekly commuting time is 25.385 minutes. Half of the average weekly commuting times are less than 24.75 minutes. One-quarter of the average weekly commuting times are less than 23 minutes, while another one-quarter are more than 27.1 minutes. The range of average weekly commuting time is 16.8 minutes. The middle 50% of the average weekly commuting times spread over 4.1 minutes. The typical spread of average weekly commuting time around the mean is 3.33 minutes.

3.80

(a), (b) Excel Output

Five-Number Summary
                     Average FICO Score
Minimum              675
First Quartile       699
Median               717
Third Quartile       726
Maximum              739
IQR                  27

                     Average FICO Score
Mean                 712.745098
Median               717
Mode                 690
Minimum              675
Maximum              739
Range                64
Variance             242.0337
Standard Deviation   15.5574
Coeff. of Variation  2.18%
Skewness             -0.5566
Kurtosis             -0.6947
Count                51
Standard Error       2.1785

(c) Since the mean is less than the median, the data are left-skewed.

(d) The mean of the average credit scores is 712.7451. Half of the average credit scores are less than 717. One-quarter of the average credit scores are less than 699 while another one-quarter are more than 726. The range of the average credit score is 64. The middle 50% of the average credit scores are spread over 27. The typical spread of the average credit scores around the mean is 15.557.

3.81

The variables “gender” and “major” are categorical and cannot be summarized with boxplots because boxplots are created using the data from numerical variables. Similarly, the mean is a statistic computed on numerical variables, so it is not appropriate for the categorical variables “gender” or “major.” Pie charts are used for categorical variables, so they should not be created using data from the numerical variables “grade point average” and “height.”


3.82

Excel output:

Five-Number Summary
                     Alcohol       Calories     Carbohydrates
Minimum              2.4           55           1.9
First Quartile       4.4           130.5        8.65
Median               4.92          151          12
Third Quartile       5.65          170.5        14.7
Maximum              11.5          330          32.1
IQR                  1.25          40           6.05

                     Alcohol       Calories     Carbohydrates
Mean                 5.269490446   155.656051   12.05171975
Median               4.92          151          12
Mode                 4.2           110          12
Minimum              2.4           55           1.9
Maximum              11.5          330          32.1
Range                9.1           275          30.2
Variance             1.8344        1917.4707    24.8094
Standard Deviation   1.3544        43.7889      4.9809
Coeff. of Variation  25.70%        28.13%       41.33%
Skewness             1.8405        1.2061       0.4912
Kurtosis             4.5814        2.9620       1.0811
Count                157           157          157
Standard Error       0.1081        3.4947       0.3975

The amount of % alcohol is right-skewed with an average of 5.269%. Half of the beers have % alcohol below 4.92%. The middle 50% of the beers have alcohol content spread over a range of 1.25%. The highest alcohol content is 11.5% while the lowest is 2.4%. The typical spread of alcohol content around the mean is 1.354%.

The number of calories is right-skewed with an average of 155.66. Half of the beers have calories below 151. The middle 50% of the beers have calories spread over a range of 40. The highest number of calories is 330 while the lowest is 55. The typical spread of calories around the mean is 43.79.

The number of carbohydrates is roughly symmetric, with an average of 12.052, which is almost identical to the median of 12.000. Half of the beers have carbohydrates below 12.000. The middle 50% of the beers have carbohydrates spread over a range of 6.05. The highest number of carbohydrates is 32.10 while the lowest is 1.9. The typical spread of carbohydrates around the mean is 4.98.


Chapter 4

4.1

(a) Simple events include tossing a head or tossing a tail.

(b) Joint events include tossing three heads (HHH), a head followed by two tails (HTT), a tail followed by two heads (THH), and three tails (TTT).

(c) Tossing a tail on the first toss

(d) The sample space is the collection of (HHH), (HHT), (HTH), (THH), (TTH), (THT), (HTT), and (TTT).

4.2

(a) Simple events include selecting a red ball.

(b) Selecting a white ball

(c) The sample space consists of the 12 red balls and the 8 white balls.

4.3

(a) 30/90 = 1/3 = 0.33

(b) 60/90 = 2/3 = 0.67

(c) 10/90 = 1/9 = 0.11

(d) 30/90 + 30/90 - 10/90 = 50/90 = 5/9 = 0.556

4.4

(a) 60/100 = 3/5 = 0.6

(b) 10/100 = 1/10 = 0.1

(c) 35/100 = 7/20 = 0.35

(d) 60/100 + 65/100 - 35/100 = 90/100 = 9/10 = 0.9

4.5

(a) a priori

(b) Subjective

(c) a priori

(d) Empirical

4.6

(a) Mutually exclusive, not collectively exhaustive.

(b) Not mutually exclusive, not collectively exhaustive.

(c) Mutually exclusive, not collectively exhaustive.

(d) Mutually exclusive, collectively exhaustive

4.7

(a) The joint probability of mutually exclusive events (being listed on the New York Stock Exchange and NASDAQ) is zero.

(b) The joint probability of the events (owning a smartphone and a tablet) is not zero because a consumer can own both a smartphone and a tablet at the same time.

4.7 cont.

(c) The joint probability of mutually exclusive events (being an Apple cellphone and a Samsung cellphone) is zero.

(d) The joint probability of the events (an automobile that is a Toyota and was manufactured in the U.S.) is not zero because a Toyota can be manufactured in the U.S.

4.8

(a) A respondent in the 50–59 age group.

(b) A respondent in the 50–59 age group who has more than $100,000 in retirement savings.

(c) A retirement account that has more than $100,000 in savings.

(d) Because age group and amount of retirement savings are two different characteristics.

4.9

(a) P(retirement account less than $100,000) = (530 + 380)/2,000 = 0.455

(b) P(60–69 and less than $100,000) = 380/2,000 = 0.19

(c) P(60–69 or less than $100,000) = (1,000 + 910 - 380)/2,000 = 0.765

(d) Question (b) asks for the intersection: both conditions being satisfied, the “and” composition. Question (c) asks for the “or” composition: at least one of the two conditions being satisfied. The difference between the two answers is the set of outcomes where one of the two conditions is satisfied, but not the other.

4.10

Answers will vary.

(a) A marketer who uses LinkedIn.

(b) A B2B marketer who uses LinkedIn.

(c) A marketer who uses B2C.

(d) A marketer who uses Facebook and is a B2C marketer is a joint event because it consists of two characteristics, uses Facebook and is a B2C marketer.


4.11

(a) P(chosen Facebook) = 1,030/2,000 = 0.5150

(b) P(B2B and chosen LinkedIn) = 350/2,000 = 0.1750

(c) P(B2B or chosen LinkedIn) = (1,000 + 400 - 50)/2,000 = 0.6750

(d) The probability of “is B2B or has chosen LinkedIn” includes the probability of “is B2B” and the probability of “chosen LinkedIn” minus the probability of “is B2B and chosen LinkedIn.”

4.12

(a) P(fully supports increased use of educational technologies in higher ed) = 686/1,803 = 0.3805

(b) P(is a digital learning leader) = 206/1,803 = 0.1143

(c) P(fully supports increased use of educational technologies or is a digital learning leader) = (686 + 206 - 175)/1,803 = 717/1,803 = 0.3977

(d) The probability in (c) includes those who fully support increased use of educational technologies in higher education and those who are digital learning leaders.

4.13

(a) P(has a relationship) = 89/127 = 0.7008

(b) P(has a relationship or is Latino) = (89 + 52 - 35)/127 = 0.8346

(c) P(has a relationship and is Latino) = 35/127 = 0.2756

(d) The probability of “has a relationship or is Latino” includes the probability of “has a relationship” and the probability of “is Latino” minus the joint probability of “has a relationship and is Latino.”



4.14

Important to Understand
Privacy Policy            Older adults   Younger adults   Total
Yes                       911            195              1106
No                        90             65               155
Total                     1001           260              1261

(a) P(is important to have a clear understanding of a company’s privacy policy before signing up for its service online) = 1,106/1,261 = 0.8771

(b) P(is an older adult and indicates it is important to have a clear understanding of a company’s privacy policy before signing up for its service online) = 911/1,261 = 0.7224

(c) P(is an older adult or indicates it is important to have a clear understanding of a company’s privacy policy before signing up for its service online) = (1,001 + 1,106 - 911)/1,261 = 1,196/1,261 = 0.9485

(d) P(is an older adult or a younger adult) = 1,261/1,261 = 1.00

4.15

Needs Warranty-Related Repair   U.S.    Non-U.S.   Total
Yes                             0.025   0.015      0.04
No                              0.575   0.385      0.96
Total                           0.600   0.400      1.00

(a) P(needs warranty repair) = 0.04

(b) P(needs warranty repair and manufacturer based in U.S.) = 0.025

(c) P(needs warranty repair or manufacturer based in U.S.) = 0.615

(d) P(needs warranty repair or manufacturer not based in U.S.) = 0.425
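The answers in (c) and (d) follow from the general addition rule, P(A or B) = P(A) + P(B) - P(A and B), applied to the joint probability table for this problem. The sketch below is one way to check them; the dictionary layout and the helper names `marginal` and `prob_or` are illustrative choices, not part of the text.

```python
# Joint probabilities from the 4.15 table: (needs repair, manufacturer location).
joint = {
    ("yes", "US"): 0.025,
    ("yes", "non-US"): 0.015,
    ("no", "US"): 0.575,
    ("no", "non-US"): 0.385,
}

def marginal(table, index, level):
    """Sum the joint probabilities whose key matches `level` at position `index`."""
    return sum(p for key, p in table.items() if key[index] == level)

def prob_or(table, row_level, col_level):
    """General addition rule: P(A or B) = P(A) + P(B) - P(A and B)."""
    return (marginal(table, 0, row_level)
            + marginal(table, 1, col_level)
            - table[(row_level, col_level)])

p_c = prob_or(joint, "yes", "US")       # 0.04 + 0.60 - 0.025
p_d = prob_or(joint, "yes", "non-US")   # 0.04 + 0.40 - 0.015
```

Subtracting the joint probability avoids counting the cell that belongs to both events twice.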

4.16

(a) P(A | B) = 10/30 = 1/3 = 0.33

(b) P(A | B′) = 20/60 = 1/3 = 0.33

(c) P(A′ | B′) = 40/60 = 2/3 = 0.67

(d) Since P(A | B) = P(A) = 1/3, events A and B are statistically independent.

4.17

(a) P(A | B) = 10/35 = 2/7 = 0.2857

(b) P(A′ | B′) = 35/65 = 7/13 = 0.5385

(c) P(A | B′) = 30/65 = 6/13 = 0.4615

(d) Since P(A | B) = 0.2857 and P(A) = 0.40, events A and B are not statistically independent.

4.18

P(A | B) = P(A and B)/P(B) = 0.4/0.8 = 1/2 = 0.5

4.19

P(A and B) = P(A) P(B) = (0.7)(0.6) = 0.42

4.20

Since P(A and B) = 0.20 and P(A) P(B) = 0.12, events A and B are not statistically independent.
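The independence checks in 4.16 and 4.17 compare P(A | B) with P(A). A minimal sketch using exact fractions is shown below. The function name `is_independent` is hypothetical, and the count arguments are assumptions inferred from the fractions in those solutions (for example, 10 of the 30 outcomes in B also being in A, out of 90 outcomes in total for 4.16).

```python
from fractions import Fraction

def is_independent(n_a_and_b, n_a, n_b, n_total):
    """A and B are independent when P(A | B) equals P(A)."""
    p_a_given_b = Fraction(n_a_and_b, n_b)
    p_a = Fraction(n_a, n_total)
    return p_a_given_b == p_a

# Counts assumed from the fractions shown in 4.16 and 4.17.
independent_416 = is_independent(10, 30, 30, 90)    # 10/30 vs 30/90
independent_417 = is_independent(10, 40, 35, 100)   # 10/35 vs 40/100
```

Using `Fraction` keeps the comparison exact, so 10/30 and 30/90 compare equal without any rounding tolerance.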

4.21

(a) P(less than $100k | 50–59) = 530/1,000 = 0.5300

(b) P(50–59 | less than $100k) = 530/910 = 0.5824

(c) The conditional events are reversed.

(d) Since P(less than $100k | 50–59) = 530/1,000 = 0.5300 is not equal to P(less than $100k) = 910/2,000 = 0.4550, having a retirement savings of less than $100,000 and age group are not independent.

4.22

(a) P(Chosen Facebook | B2B) = 400/1,000 = 0.4000

(b) P(B2B | Chosen Facebook) = 400/1,030 = 0.3883

(c) The conditional events are reversed.

(d) P(Chosen Facebook | B2B) = 400/1,000 = 0.4000 and P(Chosen Facebook) = 1,030/2,000 = 0.5150 are not equal. Therefore, business focus and social media used are not independent.

4.23

(a) P(Latino | has a relationship) = 35/89 = 0.3933

(b) P(has a relationship | Latino) = 35/52 = 0.6731

(c) The conditional events are reversed.

(d) P(Latino | has a relationship) = 35/89 = 0.3933 and P(Latino) = 52/127 = 0.4094 are not equal. Therefore, having a business relationship with a bank or credit union and ethnicity of the small business owner are not independent.

4.24

(a) P(fully supports increased use of educational technologies in higher ed | faculty member) = 511/1,597 = 0.3200

(b) P(does not fully support increased use of educational technologies in higher ed | faculty member) = 1,086/1,597 = 0.6800

(c) P(fully supports increased use of educational technologies in higher ed | digital learning leader) = 175/206 = 0.8495

(d) P(does not fully support increased use of educational technologies in higher ed | digital learning leader) = 31/206 = 0.1505

4.25

Important to Understand
Privacy Policy            Older adults   Younger adults   Total
Yes                       911            195              1106
No                        90             65               155
Total                     1001           260              1261

(a) P(important to understand privacy policy | older adult) = 911/1,001 = 0.9101

(b) P(younger adult | does not indicate that it is important to understand privacy policy) = 65/155 = 0.4194

(c) Since P(younger adult | does not indicate that it is important to understand privacy policy) = 0.4194 is not equal to P(younger adult) = 260/1,261 = 0.2062, indicating that it is important to understand privacy policy and adult age are not independent.


4.26

Needs Warranty-Related Repair   U.S.    Non-U.S.   Total
Yes                             0.025   0.015      0.04
No                              0.575   0.385      0.96
Total                           0.600   0.400      1.00

(a) P(needs warranty repair | manufacturer based in U.S.) = 0.025/0.6 = 0.0417

(b) P(needs warranty repair | manufacturer not based in U.S.) = 0.015/0.4 = 0.0375

(c) Since P(needs warranty repair | manufacturer based in U.S.) = 0.025/0.6 = 0.0417 is not equal to P(needs warranty repair) = 0.04, the two events are not independent.

4.27

(a) P(higher for the year) = (39 + 8)/(39 + 8 + 12 + 12) = 0.6620

(b) P(higher for the year | higher first week) = 39/(39 + 8) = 0.8298

(c) Since P(higher for the year) = 0.6620 is not equal to P(higher for the year | higher first week) = 0.8298, the two events, first-week performance and annual performance, are not statistically independent.

(d) Answers will vary.

4.28

(a) P(both queens) = (4/52)(3/51) = 12/2,652 = 1/221 = 0.0045

(b) P(10 followed by 5 or 6) = (4/52)(8/51) = 32/2,652 = 8/663 = 0.012

(c) P(both queens) = (4/52)(4/52) = 16/2,704 = 1/169 = 0.0059

(d) P(blackjack) = (16/52)(4/51) + (4/52)(16/51) = 128/2,652 = 32/663 = 0.0483

4.29

(a) P(2 red cellphones) = (2/9)(1/8) = 2/72 = 1/36 = 0.0278

(b) P(1 red cellphone and 1 black cellphone) = (7/9)(2/8) + (2/9)(7/8) = 28/72 = 7/18 = 0.3889

(c) P(3 red cellphones) = (2/9)^3 = 0.01097

(d) (a) P(2 red cellphones) = (2/9)(2/9) = 4/81 = 0.0494
    (b) P(1 red cellphone and 1 black cellphone) = 2(7/9)(2/9) = 0.3457
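The card probabilities in 4.28 are products of conditional probabilities (the multiplication rule), with the second factor depending on whether the first card is replaced. The following sketch reproduces them with exact fractions; the variable names are illustrative only.

```python
from fractions import Fraction

# 4.28 (a): two queens drawn without replacement.
both_queens_no_replace = Fraction(4, 52) * Fraction(3, 51)

# 4.28 (b): a 10 followed by a 5 or 6, without replacement (8 favorable second cards).
ten_then_5_or_6 = Fraction(4, 52) * Fraction(8, 51)

# 4.28 (c): two queens drawn with replacement.
both_queens_replace = Fraction(4, 52) * Fraction(4, 52)

# 4.28 (d): blackjack as (ten-valued card, then ace) or (ace, then ten-valued card).
blackjack = Fraction(16, 52) * Fraction(4, 51) + Fraction(4, 52) * Fraction(16, 51)

rounded = {
    "a": round(float(both_queens_no_replace), 4),
    "b": round(float(ten_then_5_or_6), 3),
    "c": round(float(both_queens_replace), 4),
    "d": round(float(blackjack), 4),
}
```

The same pattern applies to the cellphone draws in 4.29, with 2 red and 7 black items in place of the card counts.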

4.30

P(B | A) = P(A | B) P(B) / [P(A | B) P(B) + P(A | B′) P(B′)] = (0.8)(0.05) / [(0.8)(0.05) + (0.4)(0.95)] = 0.04/0.42 = 0.095

4.31

P(B | A) = P(A | B) P(B) / [P(A | B) P(B) + P(A | B′) P(B′)] = (0.6)(0.3) / [(0.6)(0.3) + (0.5)(0.7)] = 0.18/0.53 = 0.340

4.32

D = has disease
T = tests positive

(a) P(D | T) = P(T | D) P(D) / [P(T | D) P(D) + P(T | D′) P(D′)] = (0.9)(0.03) / [(0.9)(0.03) + (0.01)(0.97)] = 0.027/0.0367 = 0.736

(b) P(D′ | T′) = P(T′ | D′) P(D′) / [P(T′ | D′) P(D′) + P(T′ | D) P(D)] = (0.99)(0.97) / [(0.99)(0.97) + (0.10)(0.03)] = 0.9603/0.9633 = 0.997

4.33

S = Shop online in office
M = shopper is male

(a) P(S | M) = P(M | S) P(S) / [P(M | S) P(S) + P(M | S′) P(S′)] = (0.57)(0.23) / [(0.57)(0.23) + (0.48)(0.77)] = 0.1311/0.5007 = 0.2618

(b) P(M) = (0.57)(0.23) + (0.48)(0.77) = 0.5007

4.34

B = Base Construction Co. enters a bid
O = Olive Construction Co. wins the contract

(a) P(B′ | O) = P(O | B′) P(B′) / [P(O | B′) P(B′) + P(O | B) P(B)] = (0.5)(0.3) / [(0.5)(0.3) + (0.25)(0.7)] = 0.15/0.325 = 0.4615

(b) P(O) = 0.175 + 0.15 = 0.325

4.35

W = Women started business
A = Above $50,000 revenues

(a) P(W | A) = P(A | W) P(W) / [P(A | W) P(W) + P(A | W′) P(W′)] = (0.2)(0.35) / [(0.2)(0.35) + (0.24)(0.65)] = 0.07/0.226 = 0.3097

(b) P(A) = (0.2)(0.35) + (0.24)(0.65) = 0.226

4.36

(a) P(huge success | favorable review) = 0.099/0.459 = 0.2157
P(moderate success | favorable review) = 0.14/0.459 = 0.3050
P(break even | favorable review) = 0.16/0.459 = 0.3486
P(loser | favorable review) = 0.06/0.459 = 0.1307

(b) P(favorable review) = 0.99(0.1) + 0.7(0.2) + 0.4(0.4) + 0.2(0.3) = 0.459

4.37

(a) P(A rating | issued by city) = 0.35/0.56 = 0.625

(b) P(issued by city) = 0.5(0.7) + 0.6(0.2) + 0.9(0.1) = 0.56

(c) P(issued by suburb) = 0.4(0.7) + 0.2(0.2) + 0.05(0.1) = 0.325
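The Bayes' theorem calculations in 4.30 through 4.35 all share the same two-event form, so they can be checked with one small helper. This is a sketch only; the function name `posterior` and its parameter names are illustrative and not from the text.

```python
def posterior(prior, p_evidence_given_event, p_evidence_given_complement):
    """Bayes' theorem for an event B and its complement:

    P(B | A) = P(A | B) P(B) / [P(A | B) P(B) + P(A | B') P(B')]
    """
    numerator = p_evidence_given_event * prior
    denominator = numerator + p_evidence_given_complement * (1 - prior)
    return numerator / denominator

p_430 = posterior(0.05, 0.8, 0.4)     # 4.30: 0.04 / 0.42
p_432a = posterior(0.03, 0.9, 0.01)   # 4.32 (a): 0.027 / 0.0367
p_433a = posterior(0.23, 0.57, 0.48)  # 4.33 (a): 0.1311 / 0.5007
```

The denominator is the total probability of the observed evidence, which is why 4.33 (b) and 4.35 (b) report the same quantity that appears under the fraction in part (a).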

4.38

3^10 = 59,049

4.39

(a) (30)(30)(30) = 27,000

(b) (1/30)(1/30)(1/30) = 0.000037

(c) In “dial combination,” the order of the combination is important while order is irrelevant in the mathematical combination expressed by equation (4.14).

4.40

(a) 2^7 = 128

(b) 6^7 = 279,936

(c) There are two mutually exclusive and collectively exhaustive outcomes in (a) and six in (b).

4.41

(7)(4)(3) = 84

4.42

(5)(4)(9)(5)(6) = 5,400

4.43

n! = 4! = (4)(3)(2)(1) = 24

4.44

5! = (5)(4)(3)(2)(1) = 120; not all the orders are equally likely because the teams have a different probability of finishing first through fifth.

4.45

5P4 = 5!/(5 - 4)! = 5!/1! = 120

4.46

n! = 6! = 720

4.47

8C2 = 8!/(2! 6!) = 28

4.48

10C4 = 10!/(4! 6!) = 210

4.49

7C4 = 7!/(4! 3!) = (7)(6)(5)/[(3)(2)(1)] = 35

4.50

100C2 = 100!/(2! 98!) = (100)(99)/2 = 4,950

4.51

20C3 = 20!/(3! 17!) = (20)(19)(18)/[(3)(2)(1)] = 1,140

4.52

With a priori probability, the probability of success is based on prior knowledge of the process involved. With empirical probability, outcomes are based on observed data. Subjective probability refers to the chance of occurrence assigned to an event by a particular individual.

4.53

A simple event can be described by a single characteristic. Joint probability refers to phenomena containing two or more events.
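The counting-rule answers in 4.38 through 4.51 can be verified with Python's standard `math` module (`math.perm` and `math.comb` require Python 3.8 or later). The dictionary below is only an illustrative way to collect the results by problem number.

```python
import math

counting_answers = {
    "4.38": 3 ** 10,              # k^n with k = 3 outcomes over n = 10 trials
    "4.40a": 2 ** 7,
    "4.40b": 6 ** 7,
    "4.41": 7 * 4 * 3,            # differing numbers of choices per stage
    "4.43": math.factorial(4),    # arrangements of 4 items
    "4.45": math.perm(5, 4),      # permutations: 5!/(5 - 4)!
    "4.46": math.factorial(6),
    "4.47": math.comb(8, 2),      # combinations: 8!/(2! 6!)
    "4.48": math.comb(10, 4),
    "4.49": math.comb(7, 4),
    "4.50": math.comb(100, 2),
    "4.51": math.comb(20, 3),
}
```

Permutations count ordered arrangements and combinations count unordered selections, which is why 5P4 = 120 is larger than 5C4 = 5 would be.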

4.54

The general addition rule is used by adding the probability of A and the probability of B and then subtracting the joint probability of A and B.

4.55

Events are mutually exclusive if both cannot occur at the same time. Events are collectively exhaustive if one of the events must occur.

4.56

If events A and B are statistically independent, the conditional probability of event A given B is equal to the probability of A.

4.57

When events A and B are independent, the probability of A and B is the product of the probability of event A and the probability of event B. When events A and B are not independent, the probability of A and B is the product of the conditional probability of event A given event B and the probability of event B.

4.58

Bayes’ theorem uses conditional probabilities to revise the probability of an event in the light of new information.

4.59

In Bayes’ theorem, the prior probability is an unconditioned probability while the revised probability is the probability of the original event updated in light of some new information.

4.60

For Counting Rule 1, the number of possible events is the same for each trial. Counting Rule 2 allows for the number of possible events to differ for each trial.

4.61

In combinations, the order of the elements in the arrangement does not matter, whereas in permutations, the order of the arrangement of the elements does matter.

4.62

(a)

                                    Generation
Interested in Investment Learning   Z      X      Total
Yes                                 390    305    695
No                                  110    195    305
Total                               500    500    1,000

(b) Answers may vary. A simple event is “Generation Z.”

(c) Answers may vary. A joint event is “Generation Z and interested in investment learning.”

(d) P(interested in learning) = 695/1,000 = 0.6950

(e) P(interested in learning and generation Z) = 390/1,000 = 0.3900

(f) P(interested in learning or generation Z) = (695 + 500 - 390)/1,000 = 0.8050

(g) They are not independent because generation Z and generation X have different probabilities of interest in investment learning.

4.63

(a) P(is an HR employee) = 132/400 = 0.33

(b) P(is an HR employee or indicates that absenteeism is an important metric) = (132 + 126 - 54)/400 = 204/400 = 0.51

(c) P(does not indicate that presenteeism is an important metric and is a non-HR employee) = 225/400 = 0.5625

(d) P(does indicate that presenteeism is an important metric or is a non-HR employee) = (304 + 268 - 225)/400 = 347/400 = 0.8675

(e) P(a non-HR employee given they indicate that presenteeism is an important metric) = 43/96 = 0.4479

(f) They are not independent.

(g) They are not independent.

4.64

(a) P(at least once a week) = 132/1,002 = 0.1317

(b) P(at least once a week or at least once a month) = (132 + 73)/1,002 = 205/1,002 = 0.2046

(c) P(Gen Z or several times a day) = (140 + 252 - 50)/1,002 = 342/1,002 = 0.3413

(d) P(Gen Z and several times a day) = 50/1,002 = 0.0499

(e) P(never | Gen Boomer) = 157/341 = 0.4604

4.65

(a) P(B2B service) = 1,000/4,000 = 0.25

(b) P(budget based on previous year) = [(0.521)(1,000) + (0.39)(1,000) + (0.30)(1,000) + (0.278)(1,000)]/4,000 = 1,489/4,000 = 0.3723

(c) P(B2B service and budget based on previous year) = (0.521)(1,000)/4,000 = 0.1303

(d) P(B2B service or budget based on previous year) = [1,000 + (0.521)(1,000)]/4,000 = 0.3803

(e) P(B2B service | budget based on previous year) = (0.39)(1,000)/1,000 = 0.39

(f) P(budget based on previous year | B2B product or services) = [(0.521)(1,000) + (0.39)(1,000)]/2,000 = 0.4555

(g) They are not independent.

Chapter 5

5.1

PHStat output for Distribution A:

Probabilities & Outcomes:
P       X    Y
0.5     0
0.2     1
0.15    2
0.1     3
0.05    4

Statistics
E(X)                       1
E(Y)                       0
Variance(X)                1.5
Standard Deviation(X)      1.224745
Variance(Y)                0
Standard Deviation(Y)      0
Covariance(XY)             0
Variance(X+Y)              1.5
Standard Deviation(X+Y)    1.224745

PHStat output for Distribution B:

Probabilities & Outcomes:
P       X    Y
0.05    0
0.1     1
0.15    2
0.2     3
0.5     4

Statistics
E(X)                       3
E(Y)                       0
Variance(X)                1.5
Standard Deviation(X)      1.224745
Variance(Y)                0
Standard Deviation(Y)      0
Covariance(XY)             0
Variance(X+Y)              1.5
Standard Deviation(X+Y)    1.224745

(a)

Distribution A
X    P(X)    X*P(X)
0    0.50    0.00
1    0.20    0.20
2    0.15    0.30
3    0.10    0.30
4    0.05    0.20
             μ = 1.00

Distribution B
X    P(X)    X*P(X)
0    0.05    0.00
1    0.10    0.10
2    0.15    0.30
3    0.20    0.60
4    0.50    2.00
             μ = 3.00

(b)

Distribution A
X    (X – μ)²    P(X)    (X – μ)²*P(X)
0    (–1)²       0.50    0.50
1    (0)²        0.20    0.00
2    (1)²        0.15    0.15
3    (2)²        0.10    0.40
4    (3)²        0.05    0.45
                         σ² = 1.50

σ = √[Σ (X – μ)² P(X)] = 1.22

Distribution B
X    (X – μ)²    P(X)    (X – μ)²*P(X)
0    (–3)²       0.05    0.45
1    (–2)²       0.10    0.40
2    (–1)²       0.15    0.15
3    (0)²        0.20    0.00
4    (1)²        0.50    0.50
                         σ² = 1.50

σ = √[Σ (X – μ)² P(X)] = 1.22

(c)

For distribution A, P(X ≥ 3) = 0.10 + 0.05 = 0.15
For distribution B, P(X ≥ 3) = 0.20 + 0.50 = 0.70

(d)

The means are different, but the variances are the same.

5.2

(a)–(b)

X    P(X)    X*P(X)    (X – μ_X)²    (X – μ_X)²*P(X)
0    0.10    0.00      4             0.40
1    0.20    0.20      1             0.20
2    0.45    0.90      0             0.00
3    0.15    0.45      1             0.15
4    0.05    0.20      4             0.20
5    0.05    0.25      9             0.45

(a) Mean = 2.00, Variance = 1.40

(b) Stdev = 1.18321596

(c) P(X ≥ 2) = 0.45 + 0.15 + 0.05 + 0.05 = 0.70

5.3

(a) Based on the fact that the odds of winning are expressed with a base of 31,478, you would think that the automobile dealership sent out 31,478 fliers.

(b) μ = Σ X_i P(X_i) = 5000(1/31,478) + 60(1/31,478) + 5(31,476/31,478) = $5.16

(c) σ = √(Σ [X_i – E(X)]² P(X_i)) = $28.15

(d) The total cost of the prizes is $5,000 + $60 + 31,476 * $5 = $162,440. Assuming that the cost of producing the fliers is negligible, the cost of reaching a single customer is $162,440/31,478 = $5.16. The effectiveness of the promotion will depend on how many customers show up in the showroom.

5.4

(a)

X      P(X)
$–1    21/36
$+1    15/36

(b)

X      P(X)
$–1    21/36
$+1    15/36

(c)

X      P(X)
$–1    30/36
$+4    6/36

(d) $–0.167 for each method of play

5.5

Excel Output:

Arrivals = X    Frequency    Probability = Frequency/n    [X-E(X)]^2    Cell formula
0               15           0.075                        8.0089        =(A2-B13)^2
1               31           0.155                        3.3489        =(A3-B13)^2
2               47           0.235                        0.6889        =(A4-B13)^2
3               41           0.205                        0.0289        =(A5-B13)^2
4               29           0.145                        1.3689        =(A6-B13)^2
5               24           0.12                         4.7089        =(A7-B13)^2
6               10           0.05                         10.0489       =(A8-B13)^2
7               2            0.01                         17.3889       =(A9-B13)^2
8               1            0.005                        26.7289       =(A10-B13)^2

n               200          =SUM(B2:B10)
E(X)            2.83         =SUMPRODUCT(A2:A10, C2:C10)
Variance(X)     2.8611       =SUMPRODUCT(C2:C10, D2:D10)
Std Dev (X)     1.691479     =SQRT(B14)

(a) μ = E(X) = 2.83

(b) σ = 1.69

(c) P(X < 2) = 0.075 + 0.155 = 0.230

5.6

Excel Output:

Approved = X    Frequency    Probability = Frequency/n    [X-E(X)]^2    Cell formula
0               13           0.125                        4.393861      =(A2-B12)^2
1               29           0.278846                     1.201553      =(A3-B12)^2
2               28           0.269231                     0.009246      =(A4-B12)^2
3               15           0.144231                     0.816938      =(A5-B12)^2
4               11           0.105769                     3.62463       =(A6-B12)^2
5               5            0.048077                     8.432322      =(A7-B12)^2
6               2            0.019231                     15.24001      =(A8-B12)^2
7               1            0.009615                     24.04771      =(A9-B12)^2

n               104          =SUM(B2:B9)
E(X)            2.096154     =SUMPRODUCT(A2:A9, C2:C9)
Variance(X)     2.317678     =SUMPRODUCT(C2:C9, D2:D9)
Std Dev (X)     1.522392     =SQRT(B13)

(a) μ = E(X) = 2.0962

(b) σ = 1.5224

(c) P(X > 1) = 28/104 + 15/104 + 11/104 + 5/104 + 2/104 + 1/104 = 0.5962

5.7

Excel output:

Probability    Stock X    Stock Y    [X-E(X)]^2    [Y-E(Y)]^2
0.1            -5         -100       13340.25      62001
0.3            10         50         10100.25      9801
0.4            170        210        3540.25       3721
0.2            200        300        8010.25       22801

E(X)           110.5          E(Y)           149
Variance(X)    7382.25        Variance(Y)    15189
Std Dev(X)     85.92002095    Std Dev(Y)     123.2436611

(a) E(X) = $110.5, E(Y) = $149

(b) σ_X = $85.92, σ_Y = $123.24

(c) Stock Y gives the investor a higher expected return than stock X, but also has a higher standard deviation. Risk-averse investors would invest in stock X, whereas risk takers would invest in stock Y.
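The expected values and standard deviations reported for Problem 5.7 can be reproduced directly from the probability distribution. The following is a minimal sketch; the function name `mean_and_sd` is an illustrative choice and not part of Excel or PHStat.

```python
from math import sqrt

def mean_and_sd(values, probs):
    """Return (expected value, standard deviation) of a discrete distribution."""
    mean = sum(v * p for v, p in zip(values, probs))
    variance = sum((v - mean) ** 2 * p for v, p in zip(values, probs))
    return mean, sqrt(variance)

# Probabilities and outcomes as listed in the Excel output for Problem 5.7.
probs = [0.1, 0.3, 0.4, 0.2]
stock_x = [-5, 10, 170, 200]
stock_y = [-100, 50, 210, 300]

mean_x, sd_x = mean_and_sd(stock_x, probs)
mean_y, sd_y = mean_and_sd(stock_y, probs)

print(f"Stock X: E(X) = {mean_x:.2f}, sd = {sd_x:.2f}")
print(f"Stock Y: E(Y) = {mean_y:.2f}, sd = {sd_y:.2f}")
```

The same function applies to Problem 5.8 by substituting the bond fund and stock fund outcomes with their six probabilities.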

5.8

Excel output:

Probability    Bond X    Stock Y    [X-E(X)]^2    [Y-E(Y)]^2
0.01           -200      -700       63731         568516
0.15           -75       -300       16243.5       125316
0.09           30        -100       504.0025      23716
0.35           60        100        57.0025       2116
0.3            100       150        2261.003      9216
0.1            120       350        4563.003      87616

E(X)           52.45       E(Y)           54
Variance(X)    4273.748    Variance(Y)    38884
Std Dev(X)     65.37391    Std Dev(Y)     197.1903

(a) E(Bond fund, X) = $52.45, E(Stock fund, Y) = $54

(b) σ_X = $65.37, σ_Y = $197.19

(c) Based on the expected value criterion, you would choose the common stock fund. However, the common stock fund also has a standard deviation more than three times higher than that for the corporate bond fund. An investor should carefully weigh the increased risk.

(d) If you chose the common stock fund, you would need to assess your reaction to the small possibility that you could lose virtually all of your investment.

5.9

(a) 0.5997
(b) 0.0016
(c) 0.0439
(d) 0.4018

PHStat output for part (d):

Binomial Probabilities

Data
Sample size                            6
Probability of an event of interest    0.83

Statistics
Mean                  4.98
Variance              0.8466
Standard deviation    0.920109

Binomial Probabilities Table
X    P(X)        P(<=X)      P(<X)       P(>X)       P(>=X)
0    2.41E-05    2.41E-05    0           0.999976    1
1    0.000707    0.000731    2.41E-05    0.999269    0.999976
2    0.008631    0.009362    0.000731    0.990638    0.999269
3    0.056184    0.065546    0.009362    0.934454    0.990638
4    0.205732    0.271277    0.065546    0.728723    0.934454
5    0.401782    0.67306     0.271277    0.32694     0.728723
6    0.32694     1           0.67306     0           0.32694

5.10

(a) μ = 4(0.10) = 0.40, σ = √[4(0.10)(0.90)] = 0.60

(b) μ = 4(0.40) = 1.60, σ = √[4(0.40)(0.60)] = 0.98

(c) μ = 5(0.80) = 4.00, σ = √[5(0.80)(0.20)] = 0.894

(d) μ = 3(0.50) = 1.50, σ = √[3(0.50)(0.50)] = 0.866

5.11

Given π = 0.5 and n = 5, P(X = 5) = 0.0312.
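The binomial results in Problems 5.10 and 5.11 follow from the standard binomial formulas: mean nπ, standard deviation √[nπ(1 – π)], and P(X = x) = C(n, x) π^x (1 – π)^(n – x). The sketch below is one way to check them; the function names are illustrative and do not correspond to PHStat or Excel functions.

```python
from math import comb, sqrt

def binomial_pmf(x, n, p):
    """P(X = x) for n independent trials with event probability p."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

def binomial_mean_sd(n, p):
    """Mean n*p and standard deviation sqrt(n*p*(1-p)) of a binomial variable."""
    return n * p, sqrt(n * p * (1 - p))

# Problem 5.11: pi = 0.5, n = 5, all five observations are events of interest.
print(binomial_pmf(5, 5, 0.5))

# Problem 5.10: the four (n, pi) pairs.
for n, p in [(4, 0.10), (4, 0.40), (5, 0.80), (3, 0.50)]:
    mean, sd = binomial_mean_sd(n, p)
    print(f"n = {n}, pi = {p}: mean = {mean:.2f}, sd = {sd:.3f}")
```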

5.12

PHStat Output:

Binomial Probabilities

Data
Sample size                            6
Probability of an event of interest    0.469

Statistics
Mean                  2.814
Variance              1.4942
Standard deviation    1.2224

Binomial Probabilities Table
X    P(X)
0    0.0224
1    0.1188
2    0.2623
3    0.3089
4    0.2046
5    0.0723
6    0.0106

(a) P(X = 4) = 0.2046

(b) P(X = 6) = 0.0106

(c) P(X ≥ 4) = 0.2046 + 0.0723 + 0.0106 = 0.2876

(d) μ = 2.814, σ = 1.2224

(e) The assumptions are that each American adult either owns an iPhone or does not own an iPhone, and that the outcomes for the next six adults selected are independent.

5.13

PHStat output:

Binomial Probabilities

Data
Sample size                            5
Probability of an event of interest    0.25

Statistics
Mean                  1.25
Variance              0.9375
Standard deviation    0.968246

Binomial Probabilities Table
X    P(X)        P(<=X)      P(<X)       P(>X)       P(>=X)
0    0.237305    0.237305    0           0.762695    1
1    0.395508    0.632813    0.237305    0.367188    0.762695
2    0.263672    0.896484    0.632813    0.103516    0.367188
3    0.087891    0.984375    0.896484    0.015625    0.103516
4    0.014648    0.999023    0.984375    0.000977    0.015625
5    0.000977    1           0.999023    0           0.000977

If π = 0.25 and n = 5,

(a) P(X = 5) = 0.0010

(b) P(X ≥ 4) = P(X = 4) + P(X = 5) = 0.0146 + 0.0010 = 0.0156

(c) P(X = 0) = 0.2373

(d) P(X ≤ 2) = P(X = 0) + P(X = 1) + P(X = 2) = 0.2373 + 0.3955 + 0.2637 = 0.8965

5.14

PHStat Output:

Binomial Probabilities

Data
Sample size                            10
Probability of an event of interest    0.02

Statistics
Mean                  0.2
Variance              0.1960
Standard deviation    0.4427

Binomial Probabilities Table
X     P(X)
0     0.8171
1     0.1667
2     0.0153
3     0.0008
4     0.0000
5     0.0000
6     0.0000
7     0.0000
8     0.0000
9     0.0000
10    0.0000

P(X<=2)    0.9991
P(X>=3)    0.0009

(a) P(X = 0) = 0.8171

(b) P(X = 1) = 0.1667

(c) P(X ≤ 2) = 0.8171 + 0.1667 + 0.0153 = 0.9991

(d) P(X ≥ 3) = 0.0009

5.15

Partial PHStat output:

Binomial Probabilities

Data
Sample size                            20
Probability of an event of interest    0.07

Statistics
Mean                  1.4
Variance              1.3020
Standard deviation    1.1411

Binomial Probabilities Table
X    P(X)      P(<=X)    P(<X)     P(>X)     P(>=X)
0    0.2342    0.2342    0.0000    0.7658    1.0000
1    0.3526    0.5869    0.2342    0.4131    0.7658
2    0.2521    0.8390    0.5869    0.1610    0.4131

(a) μ = 1.4, σ = 1.1411

(b) P(X = 0) = 0.2342

(c) P(X = 1) = 0.3526

(d) P(X ≥ 2) = 0.4131


5.16

Partial PHStat output:

Binomial Probabilities

Data
Sample size                            3
Probability of an event of interest    0.905

Statistics
Mean                  2.715
Variance              0.2579
Standard deviation    0.5079

Binomial Probabilities Table
X    P(X)
0    0.0009
1    0.0245
2    0.2334
3    0.7412

(a) P(X = 3) = 0.7412

(b) P(X = 0) = 0.0009

(c) P(X ≥ 2) = 0.2334 + 0.7412 = 0.9746

(d) μ = 2.715, σ = 0.5079. On the average, over the long run, you theoretically expect 2.715 orders to be filled correctly in a sample of 3 orders with a standard deviation of 0.5079.

(e) McDonald’s has a slightly higher probability of filling orders correctly, and Wendy’s has a slightly lower probability.

5.17

Partial PHStat output:

Binomial Probabilities

Data
Sample size                            3
Probability of an event of interest    0.929

Statistics
Mean                  2.787
Variance              0.1979
Standard deviation    0.4448

Binomial Probabilities Table
X    P(X)
0    0.0004
1    0.0140
2    0.1838
3    0.8018

(a) P(X = 3) = 0.8018

(b) P(X = 0) = 0.0004

(c) P(X ≥ 2) = 0.1838 + 0.8018 = 0.9856

(d) μ = 2.787, σ = 0.4448. On the average, over the long run, you theoretically expect 2.787 orders to be filled correctly in a sample of 3 orders with a standard deviation of 0.4448.

(e) Out of all three fast-food restaurants, McDonald’s has the highest probability of filling orders correctly.

5.18

(a) Partial PHStat output:

Poisson Probabilities

Data
Average/Expected number of successes:    2.5

Poisson Probabilities Table
X    P(X)        P(<=X)      P(<X)       P(>X)       P(>=X)
2    0.256516    0.543813    0.287297    0.456187    0.712703

Using the equation, if λ = 2.5, P(X = 2) = e^(–2.5) (2.5)²/2! = 0.2565

(b) Partial PHStat output:

Poisson Probabilities

Data
Average/Expected number of successes:    8

Poisson Probabilities Table
X    P(X)        P(<=X)      P(<X)       P(>X)       P(>=X)
8    0.139587    0.592547    0.452961    0.407453    0.547039

If λ = 8.0, P(X = 8) = 0.1396

(c) Partial PHStat output:

Poisson Probabilities

Data
Average/Expected number of successes:    0.5

Poisson Probabilities Table
X    P(X)        P(<=X)      P(<X)       P(>X)       P(>=X)
0    0.606531    0.606531    0.000000    0.393469    1.000000
1    0.303265    0.909796    0.606531    0.090204    0.393469

If λ = 0.5, P(X = 1) = 0.3033

(d) Partial PHStat output:

Poisson Probabilities

Data
Average/Expected number of successes:    3.7

Poisson Probabilities Table
X    P(X)        P(<=X)      P(<X)       P(>X)       P(>=X)
0    0.024724    0.024724    0.000000    0.975276    1.000000

If λ = 3.7, P(X = 0) = 0.0247
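The four results in Problem 5.18 all come from the Poisson equation P(X = x) = e^(–λ) λ^x / x!. The following minimal sketch evaluates it directly; `poisson_pmf` is an illustrative name and not a PHStat or Excel function.

```python
from math import exp, factorial

def poisson_pmf(x, lam):
    """P(X = x) for a Poisson variable with mean lam."""
    return exp(-lam) * lam**x / factorial(x)

# (lambda, x) pairs for parts (a) through (d) of Problem 5.18.
cases = [(2.5, 2), (8.0, 8), (0.5, 1), (3.7, 0)]
for lam, x in cases:
    print(f"lambda = {lam}, P(X = {x}) = {poisson_pmf(x, lam):.4f}")
```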

5.19

(a) Partial PHStat output:

Poisson Probabilities

Data
Mean/Expected number of events of interest:    2

Poisson Probabilities Table
X    P(X)        P(<=X)      P(<X)       P(>X)       P(>=X)
0    0.135335    0.135335    0.000000    0.864665    1.000000
1    0.270671    0.406006    0.135335    0.593994    0.864665
2    0.270671    0.676676    0.406006    0.323324    0.593994

If λ = 2.0, P(X ≥ 2) = 1 – [P(X = 0) + P(X = 1)] = 1 – [0.1353 + 0.2707] = 0.5940

(b) Partial PHStat output:

Poisson Probabilities

Data
Average/Expected number of successes:    8

Poisson Probabilities Table
X    P(X)        P(<=X)      P(<X)       P(>X)       P(>=X)
0    0.000335    0.000335    0.000000    0.999665    1.000000
1    0.002684    0.003019    0.000335    0.996981    0.999665
2    0.010735    0.013754    0.003019    0.986246    0.996981
3    0.028626    0.042380    0.013754    0.957620    0.986246

If λ = 8.0, P(X ≥ 3) = 1 – [P(X = 0) + P(X = 1) + P(X = 2)] = 1 – [0.0003 + 0.0027 + 0.0107] = 1 – 0.0137 = 0.9863

(c) Partial PHStat output:

Poisson Probabilities

Data
Average/Expected number of successes:    0.5

Poisson Probabilities Table
X    P(X)        P(<=X)      P(<X)       P(>X)       P(>=X)
0    0.606531    0.606531    0.000000    0.393469    1.000000
1    0.303265    0.909796    0.606531    0.090204    0.393469

If λ = 0.5, P(X ≤ 1) = P(X = 0) + P(X = 1) = 0.6065 + 0.3033 = 0.9098

(d) Partial PHStat output:

Poisson Probabilities

Data
Average/Expected number of successes:    4

Poisson Probabilities Table
X    P(X)        P(<=X)      P(<X)       P(>X)       P(>=X)
0    0.018316    0.018316    0.000000    0.981684    1.000000
1    0.073263    0.091578    0.018316    0.908422    0.981684

If λ = 4.0, P(X ≥ 1) = 1 – P(X = 0) = 1 – 0.0183 = 0.9817

(e) Partial PHStat output:

Poisson Probabilities

Data
Average/Expected number of successes:    5

Poisson Probabilities Table
X    P(X)        P(<=X)      P(<X)       P(>X)       P(>=X)
0    0.006738    0.006738    0.000000    0.993262    1.000000
1    0.033690    0.040428    0.006738    0.959572    0.993262
2    0.084224    0.124652    0.040428    0.875348    0.959572
3    0.140374    0.265026    0.124652    0.734974    0.875348

If λ = 5.0, P(X ≤ 3) = P(X = 0) + P(X = 1) + P(X = 2) + P(X = 3) = 0.0067 + 0.0337 + 0.0842 + 0.1404 = 0.2650


5.20

PHStat output for (a) – (d)

Poisson Probabilities Table
X     P(X)        P(<=X)      P(<X)       P(>X)       P(>=X)
0     0.006738    0.006738    0.000000    0.993262    1.000000
1     0.033690    0.040428    0.006738    0.959572    0.993262
2     0.084224    0.124652    0.040428    0.875348    0.959572
3     0.140374    0.265026    0.124652    0.734974    0.875348
4     0.175467    0.440493    0.265026    0.559507    0.734974
5     0.175467    0.615961    0.440493    0.384039    0.559507
6     0.146223    0.762183    0.615961    0.237817    0.384039
7     0.104445    0.866628    0.762183    0.133372    0.237817
8     0.065278    0.931906    0.866628    0.068094    0.133372
9     0.036266    0.968172    0.931906    0.031828    0.068094
10    0.018133    0.986305    0.968172    0.013695    0.031828
11    0.008242    0.994547    0.986305    0.005453    0.013695
12    0.003434    0.997981    0.994547    0.002019    0.005453
13    0.001321    0.999302    0.997981    0.000698    0.002019
14    0.000472    0.999774    0.999302    0.000226    0.000698
15    0.000157    0.999931    0.999774    0.000069    0.000226
16    0.000049    0.999980    0.999931    0.000020    0.000069
17    0.000014    0.999995    0.999980    0.000005    0.000020
18    0.000004    0.999999    0.999995    0.000001    0.000005
19    0.000001    1.000000    0.999999    0.000000    0.000001
20    0.000000    1.000000    1.000000    0.000000    0.000000

Given λ = 5.0,

(a) P(X = 1) = 0.0337

(b) P(X < 1) = 0.0067

(c) P(X > 1) = 0.9596

(d) P(X ≤ 1) = 0.0404

5.21

Portion of PHStat Output:

POISSON.DIST Probabilities

Data
Mean/Expected number of events of interest:    12

POISSON.DIST Probabilities Table
X     P(X)      P(<=X)    P(<X)     P(>X)     P(>=X)
0     0.0000    0.0000    0.0000    1.0000    1.0000
1     0.0001    0.0001    0.0000    0.9999    1.0000
2     0.0004    0.0005    0.0001    0.9995    0.9999
3     0.0018    0.0023    0.0005    0.9977    0.9995
4     0.0053    0.0076    0.0023    0.9924    0.9977
5     0.0127    0.0203    0.0076    0.9797    0.9924
6     0.0255    0.0458    0.0203    0.9542    0.9797
7     0.0437    0.0895    0.0458    0.9105    0.9542
8     0.0655    0.1550    0.0895    0.8450    0.9105
9     0.0874    0.2424    0.1550    0.7576    0.8450
10    0.1048    0.3472    0.2424    0.6528    0.7576
11    0.1144    0.4616    0.3472    0.5384    0.6528
12    0.1144    0.5760    0.4616    0.4240    0.5384
13    0.1056    0.6815    0.5760    0.3185    0.4240
14    0.0905    0.7720    0.6815    0.2280    0.3185

(a) λ = 12, P(X = 0) = e^(–12) (12)^0/0! = 0.000006 ≈ 0

(b) λ = 12, P(X = 10) = e^(–12) (12)^10/10! = 0.1048

(c) P(X ≥ 12) = 1 – P(X < 12) = 1 – [P(X = 0) + P(X = 1) + P(X = 2) + ... + P(X = 11)]
    = 1 – [e^(–12) (12)^0/0! + e^(–12) (12)^1/1! + e^(–12) (12)^2/2! + ... + e^(–12) (12)^11/11!]
    = 1 – [0 + 0.0001 + 0.0004 + ... + 0.1144] = 1 – 0.4616 = 0.5384

(d) P(X < 13) = P(X = 0) + P(X = 1) + P(X = 2) + ... + P(X = 12)
    = e^(–12) (12)^0/0! + e^(–12) (12)^1/1! + e^(–12) (12)^2/2! + ... + e^(–12) (12)^12/12!
    = 0 + 0.0001 + 0.0004 + ... + 0.1144 = 0.5760

5.22

(a)–(c) Portion of PHStat output

Data
Average/Expected number of successes:    6

Poisson Probabilities Table
X     P(X)        P(<=X)      P(<X)       P(>X)       P(>=X)
0     0.002479    0.002479    0.000000    0.997521    1.000000
1     0.014873    0.017351    0.002479    0.982649    0.997521
2     0.044618    0.061969    0.017351    0.938031    0.982649
3     0.089235    0.151204    0.061969    0.848796    0.938031
4     0.133853    0.285057    0.151204    0.714943    0.848796
5     0.160623    0.445680    0.285057    0.554320    0.714943
6     0.160623    0.606303    0.445680    0.393697    0.554320
7     0.137677    0.743980    0.606303    0.256020    0.393697
8     0.103258    0.847237    0.743980    0.152763    0.256020
9     0.068838    0.916076    0.847237    0.083924    0.152763
10    0.041303    0.957379    0.916076    0.042621    0.083924
11    0.022529    0.979908    0.957379    0.020092    0.042621
12    0.011264    0.991173    0.979908    0.008827    0.020092
13    0.005199    0.996372    0.991173    0.003628    0.008827
14    0.002228    0.998600    0.996372    0.001400    0.003628
15    0.000891    0.999491    0.998600    0.000509    0.001400
16    0.000334    0.999825    0.999491    0.000175    0.000509
17    0.000118    0.999943    0.999825    0.000057    0.000175

(a) P(X < 5) = P(X = 0) + P(X = 1) + P(X = 2) + P(X = 3) + P(X = 4)
    = e^(–6) (6)^0/0! + e^(–6) (6)^1/1! + e^(–6) (6)^2/2! + e^(–6) (6)^3/3! + e^(–6) (6)^4/4!
    = 0.002479 + 0.014873 + 0.044618 + 0.089235 + 0.133853 = 0.2851

(b) P(X = 5) = e^(–6) (6)^5/5! = 0.1606

(c) P(X ≥ 5) = 1 – P(X < 5) = 1 – 0.2851 = 0.7149

(d) P(X = 4 or X = 5) = P(X = 4) + P(X = 5) = e^(–6) (6)^4/4! + e^(–6) (6)^5/5! = 0.2945

5.23

Partial PHStat output:

Poisson Probabilities

Data
Average/Expected number of successes:    6

Poisson Probabilities Table
X    P(X)        P(<=X)      P(<X)       P(>X)       P(>=X)
3    0.089235    0.151204    0.061969    0.848796    0.938031

If λ = 6.0, P(X ≤ 3) = P(X = 0) + P(X = 1) + P(X = 2) + P(X = 3) = 0.0025 + 0.0149 + 0.0446 + 0.0892 = 0.1512

n · P(X ≤ 3) = 100 · (0.1512) = 15.12, so 15 or 16 cookies will probably be discarded.
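The cumulative probability and the expected number of discarded cookies in Problem 5.23 can be checked with a short calculation. This is a sketch under the values stated in the solution (λ = 6.0, a batch of 100 cookies); the helper name `poisson_cdf` is illustrative only.

```python
from math import exp, factorial

def poisson_cdf(k, lam):
    """P(X <= k) for a Poisson variable with mean lam."""
    return sum(exp(-lam) * lam**x / factorial(x) for x in range(k + 1))

lam = 6.0          # mean used in the solution
batch_size = 100   # number of cookies in the batch

p_discard = poisson_cdf(3, lam)             # P(X <= 3)
expected_discards = batch_size * p_discard  # n * P(X <= 3)

print(f"P(X <= 3) = {p_discard:.4f}")
print(f"Expected number discarded = {expected_discards:.2f}")
```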

5.24

Portion of PHStat Output:

POISSON.DIST Probabilities

Data
Mean/Expected number of events of interest:    0.45

POISSON.DIST Probabilities Table
X    P(X)      P(<=X)    P(<X)     P(>X)     P(>=X)
0    0.6376    0.6376    0.0000    0.3624    1.0000
1    0.2869    0.9246    0.6376    0.0754    0.3624
2    0.0646    0.9891    0.9246    0.0109    0.0754
3    0.0097    0.9988    0.9891    0.0012    0.0109

(a) P(X = 0) = 0.6376

(b) P(X ≥ 1) = 1 – P(X = 0) = 1 – 0.6376 = 0.3624

(c) P(X ≥ 2) = 1 – [P(X = 0) + P(X = 1)] = 1 – 0.6376 – 0.2869 = 0.0754

5.25

Portion of PHStat Output:

POISSON.DIST Probabilities

Data
Mean/Expected number of events of interest:    0.99

POISSON.DIST Probabilities Table
X    P(X)      P(<=X)    P(<X)     P(>X)     P(>=X)
0    0.3716    0.3716    0.0000    0.6284    1.0000
1    0.3679    0.7394    0.3716    0.2606    0.6284
2    0.1821    0.9215    0.7394    0.0785    0.2606
3    0.0601    0.9816    0.9215    0.0184    0.0785

(a) P(X = 0) = 0.3716

(b) P(X ≥ 1) = 1 – P(X = 0) = 1 – 0.3716 = 0.6284

(c) P(X ≥ 2) = 1 – [P(X = 0) + P(X = 1)] = 1 – 0.3716 – 0.3679 = 0.2606

5.26

Portion of PHStat Output:

POISSON.DIST Probabilities

Data
Mean/Expected number of events of interest:    4.5

POISSON.DIST Probabilities Table
X    P(X)      P(<=X)    P(<X)     P(>X)     P(>=X)
0    0.0111    0.0111    0.0000    0.9889    1.0000
1    0.0500    0.0611    0.0111    0.9389    0.9889
2    0.1125    0.1736    0.0611    0.8264    0.9389
3    0.1687    0.3423    0.1736    0.6577    0.8264
4    0.1898    0.5321    0.3423    0.4679    0.6577

(a) P(X = 0) = 0.0111

(b) P(X = 1) = 0.0500

(c) P(X > 1) = 1 – [P(X = 0) + P(X = 1)] = 1 – 0.0111 – 0.0500 = 0.9389

(d) P(X < 2) = P(X = 0) + P(X = 1) = 0.0111 + 0.0500 = 0.0611

5.27

Portion of PHStat Output:

POISSON.DIST Probabilities

Data
Mean/Expected number of events of interest:    1.88

POISSON.DIST Probabilities Table
X    P(X)      P(<=X)    P(<X)     P(>X)     P(>=X)
0    0.1526    0.1526    0.0000    0.8474    1.0000
1    0.2869    0.4395    0.1526    0.5605    0.8474
2    0.2697    0.7091    0.4395    0.2909    0.5605
3    0.1690    0.8781    0.7091    0.1219    0.2909

(a) For the number of problems with a 2019 model Ford to be distributed as a Poisson random variable, we need to assume that (i) the probability that a problem occurs in a given Ford is the same for any other new Ford, (ii) the number of problems that a Ford has is independent of the number of problems any other Ford has, (iii) the probability that two or more problems will occur in some area of a Ford approaches zero as the area becomes smaller. Yes, these assumptions are reasonable in this problem.

(b) P(X = 0) = 0.1526

(c) P(X ≤ 2) = P(X = 0) + P(X = 1) + P(X = 2) = 0.7091

(d) An operational definition for problem can be “a specific feature in the car that is not performing according to its intended designed function.” The operational definition is important in interpreting the initial quality score because different customers can have different expectations of what function a feature is supposed to perform.

5.28

Portion of PHStat Output:

POISSON.DIST Probabilities

Data
Mean/Expected number of events of interest:    1.48

POISSON.DIST Probabilities Table
X    P(X)      P(<=X)    P(<X)     P(>X)     P(>=X)
0    0.2276    0.2276    0.0000    0.7724    1.0000
1    0.3369    0.5645    0.2276    0.4355    0.7724
2    0.2493    0.8139    0.5645    0.1861    0.4355
3    0.1230    0.9368    0.8139    0.0632    0.1861

(a) P(X = 0) = 0.2276

(b) P(X ≤ 2) = P(X = 0) + P(X = 1) + P(X = 2) = 0.8139

(c) Because Ford had a higher mean rate of problems per car than Hyundai, the probability of a randomly selected Ford having zero problems and the probability of no more than two problems are both lower than for Hyundai.


5.29

Partial PHStat output:

Poisson Probabilities

Data
Mean/Expected number of events of interest:    0.8

Poisson Probabilities Table
X    P(X)        P(<=X)      P(<X)       P(>X)       P(>=X)
0    0.449329    0.449329    0.000000    0.550671    1.000000
1    0.359463    0.808792    0.449329    0.191208    0.550671
2    0.143785    0.952577    0.808792    0.047423    0.191208
3    0.038343    0.990920    0.952577    0.009080    0.047423
4    0.007669    0.998589    0.990920    0.001411    0.009080
5    0.001227    0.999816    0.998589    0.000184    0.001411
6    0.000164    0.999979    0.999816    0.000021    0.000184
7    0.000019    0.999998    0.999979    0.000002    0.000021
8    0.000002    1.000000    0.999998    0.000000    0.000002

(a) For the number of phone calls received in a 1-minute period to be distributed as a Poisson random variable, we need to assume that (i) the probability that a phone call is received in a given 1-minute period is the same for all the other 1-minute periods, (ii) the number of phone calls received in a given 1-minute period is independent of the number of phone calls received in any other 1-minute period, (iii) the probability that two or more phone calls are received in a time period approaches zero as the length of the time period becomes smaller.

(b) λ = 0.8, P(X = 0) = 0.4493

(c) λ = 0.8, P(X ≥ 3) = 0.0474

(d) λ = 0.8, P(X ≤ 6) = 0.999979. A maximum of 6 phone calls will be received in a 1-minute period 99.99% of the time.

5.30

The expected value is the average of a probability distribution. It is the value that can be expected to occur on the average, in the long run.
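As an illustration of the expected value as a probability-weighted average, the sketch below uses the payoff distributions tabulated in the solution to Problem 5.4. The variable names `play_a` and `play_c` are labels chosen for this example, and exact fractions are used so that no rounding enters the calculation.

```python
from fractions import Fraction

def expected_value(distribution):
    """Sum of x * P(x) over (x, P(x)) pairs."""
    return sum(x * p for x, p in distribution)

# Payoff tables from the solution to Problem 5.4: (payoff in dollars, probability).
play_a = [(-1, Fraction(21, 36)), (1, Fraction(15, 36))]
play_c = [(-1, Fraction(30, 36)), (4, Fraction(6, 36))]

ev_a = expected_value(play_a)
ev_c = expected_value(play_c)

print(ev_a, float(ev_a))
print(ev_c, float(ev_c))
```

Both methods of play have the same expected value, which is the long-run average result per play rather than the outcome of any single play.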

5.31

The four properties of a situation that must be present in order to use the binomial distribution are (i) the sample consists of a fixed number of observations, n, (ii) each observation can be classified into one of two mutually exclusive and collectively exhaustive categories, usually called “an event of interest” and “not an event of interest”, (iii) the probability of an observation being classified as “an event of interest”, π, is constant from observation to observation and (iv) the outcome (i.e., “an event of interest” or “not an event of interest”) of any observation is independent of the outcome of any other observation.

5.32

The four properties of a situation that must be present in order to use the Poisson distribution are (i) you are interested in counting the number of times a particular event occurs in a given area of opportunity (defined by time, length, surface area, and so forth), (ii) the probability that an event occurs in a given area of opportunity is the same for all of the areas of opportunity, (iii) the number of events that occur in one area of opportunity is independent of the number of events that occur in other areas of opportunity and (iv) the probability that two or more events will occur in an area of opportunity approaches zero as the area of opportunity becomes smaller.

5.33

(a)

PHStat output:

Covariance Analysis

Probabilities & Outcomes:
P        X           Y
0.001    -1000000
0.999    4000

Statistics
E(X)                       2996
E(Y)                       0
Variance(X)                1.01E+09
Standard Deviation(X)      31733.39
Variance(Y)                0
Standard Deviation(Y)      0
Covariance(XY)             0
Variance(X+Y)              1.01E+09
Standard Deviation(X+Y)    31733.39

The expected value of the profit made by the insurance company is $2996.

(b)

On average, the promoter will have to pay $4000 while the insurance company will make a profit of $2996. This is not a win-win opportunity.
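The insurer's expected profit and its standard deviation in Problem 5.33 follow from the two-outcome distribution in the PHStat output. A minimal sketch of that calculation, with illustrative variable names:

```python
from math import sqrt

# (insurer's profit in dollars, probability), taken from the PHStat output.
outcomes = [(-1_000_000, 0.001), (4_000, 0.999)]

expected_profit = sum(x * p for x, p in outcomes)
variance = sum((x - expected_profit) ** 2 * p for x, p in outcomes)
std_dev = sqrt(variance)

print(f"E(X) = {expected_profit:.2f}")
print(f"Standard deviation = {std_dev:.2f}")
```

The standard deviation is roughly ten times the expected profit, which reflects the small chance of a very large payout.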

5.34

(a) 0.675

(b) 0.675

Excel Output

Binomial Probabilities

Data
Sample size                            5
Probability of an event of interest    0.675

Statistics
Mean                  3.375
Variance              1.0969
Standard deviation    1.0473

Binomial Probabilities Table
X    P(X)
0    0.0036
1    0.0377
2    0.1564
3    0.3248
4    0.3373
5    0.1401

π = 0.675, n = 5

(c) P(X = 4) = 0.3373

(d) P(X = 0) = 0.0036

(e) Stock prices tend to rise in the years when the economy is expanding and fall in the years of recession or contraction. Hence, the probability that the price will rise in one year is not independent from year to year.

5.35

Excel Output

Binomial Probabilities

Data
Sample size                            10
Probability of an event of interest    0.81

Statistics
Mean                  8.1
Variance              1.5390
Standard deviation    1.2406

Binomial Probabilities Table
X     P(X)
0     0.0000
1     0.0000
2     0.0001
3     0.0006
4     0.0043
5     0.0218
6     0.0773
7     0.1883
8     0.3010
9     0.2852
10    0.1216

(a) P(X = 8) = 0.3010

(b) P(X ≥ 8) = P(X = 8) + P(X = 9) + P(X = 10) = 0.7078

(c) P(X ≤ 6) = P(X = 0) + P(X = 1) + ... + P(X = 6) = 0.1039

(d) The probability that only three respondents use two or more social media channels is approximately 0. If the probability that a retail brand uses two or more social media channels for business is 0.81, it is essentially impossible that only three businesses in 10 would use two or more social media channels. We might conclude that this geographical area has very limited internet access and that it is not appropriate to use the model in this area.

Copyright ©2024 Pearson Education, Inc.


lx Chapter 5: Discrete Probability Distributions 5.36

Excel Output

Binomial Probabilities

Data Sample size

15

Probability of an event of interest

0.5

Statistics Mean

7.5

Variance

3.7500

Standard deviation

1.9365

Binomial Probabilities Table X

P(X) 0

0.0000

1

0.0005

2

0.0032

3

0.0139

4

0.0417

5

0.0916

6

0.1527

7

0.1964

8

0.1964

9

0.1527

10

0.0916

Copyright ©2024 Pearson Education, Inc.


Solutions to End-of-Section and Chapter Review Problems lxi

(a)

11

0.0417

12

0.0139

13

0.0032

14

0.0005

P ( X  12)  P( X  12)  P( X  13)  P( X  14)  0.0175

Copyright ©2024 Pearson Education, Inc.


lxii Chapter 5: Discrete Probability Distributions 5.36

(b)

Excel Output

cont.

Binomial Probabilities

Data Sample size

15

Probability of an event of interest

0.75

Statistics Mean

11.25

Variance

2.8125

Standard deviation

1.6771

Binomial Probabilities Table X

P(X) 0

0.0000

1

0.0000

2

0.0000

3

0.0000

4

0.0001

5

0.0007

6

0.0034

7

0.0131

8

0.0393

9

0.0917

Copyright ©2024 Pearson Education, Inc.


Solutions to End-of-Section and Chapter Review Problems lxiii

(b)

10

0.1651

11

0.2252

12

0.2252

13

0.1559

14

0.0668

15

0.0134

P(X ≥ 12) = P(X = 12) + P(X = 13) + P(X = 14) + P(X = 15) = 0.4613

Copyright ©2024 Pearson Education, Inc.


lxiv Chapter 5: Discrete Probability Distributions 5.37

Excel Output:

Binomial Probabilities

Data Sample size

10

Probability of an event of interest

0.8

Statistics Mean

8

Variance

1.6000

Standard deviation

1.2649

Binomial Probabilities Table X

P(X) 0

0.0000

1

0.0000

2

0.0001

3

0.0008

4

0.0055

5

0.0264

6

0.0881

7

0.2013

8

0.3020

9

0.2684

10

0.1074

Copyright ©2024 Pearson Education, Inc.


(a) P(X = 0) = 0.0000

(b) P(X = 5) = 0.0264

(c) P(X > 5) = P(X = 6) + P(X = 7) + P(X = 8) + P(X = 9) + P(X = 10) = 0.9672

(d) μ = 8, σ = 1.2649

Copyright ©2024 Pearson Education, Inc.


lxvi Chapter 5: Discrete Probability Distributions 5.38

Excel Output:

Binomial Probabilities

Data Sample size

10

Probability of an event of interest

0.3

Statistics Mean

3

Variance

2.1000

Standard deviation

1.4491

Binomial Probabilities Table X

P(X) 0

0.0282

1

0.1211

2

0.2335

3

0.2668

4

0.2001

5

0.1029

6

0.0368

7

0.0090

8

0.0014

9

0.0001

10

0.0000

Copyright ©2024 Pearson Education, Inc.


(a) P(X = 0) = 0.0282

(b) P(X = 5) = 0.1029

(c) P(X > 5) = P(X = 6) + P(X = 7) + P(X = 8) + P(X = 9) + P(X = 10) = 0.0473

(d) μ = 3, σ = 1.4491

(e) Since the percentage of bills containing an error is lower in this problem, the probability is higher in (a) and (b) of this problem and lower in (c).

Copyright ©2024 Pearson Education, Inc.


lxviii Chapter 5: Discrete Probability Distributions 5.39

Excel Output:

Binomial Probabilities

Data Sample size

10

Probability of an event of interest

0.28

Statistics Mean

2.8

Variance

2.0160

Standard deviation

1.4199

Binomial Probabilities Table X

P(X) 0

0.0374

1

0.1456

2

0.2548

3

0.2642

4

0.1798

5

0.0839

6

0.0272

7

0.0060

8

0.0009

9

0.0001

10

0.0000

Copyright ©2024 Pearson Education, Inc.


(a) P(X > 5) = P(X = 6) + P(X = 7) + P(X = 8) + P(X = 9) + P(X = 10) = 0.0342

(b) P(X < 2) = P(X = 0) + P(X = 1) = 0.1830

(c) P(X = 0) = 0.0374

(d) The assumptions needed are (i) there are only two mutually exclusive and collectively exhaustive outcomes: “one-word searches” or “not one-word searches,” (ii) the probabilities are constant, and (iii) the outcomes are independent.

Copyright ©2024 Pearson Education, Inc.


lxx Chapter 5: Discrete Probability Distributions 5.40

Excel Output:

Binomial Probabilities

Data Sample size

20

Probability of an event of interest

0.62

Statistics Mean

12.4

Variance

4.7120

Standard deviation

2.1707

Binomial Probabilities Table X

P(X) 0

0.0000

1

0.0000

2

0.0000

3

0.0000

4

0.0001

5

0.0007

6

0.0029

7

0.0094

8

0.0249

9

0.0542

10

0.0974

Copyright ©2024 Pearson Education, Inc.


Solutions to End-of-Section and Chapter Review Problems lxxi 11

0.1444

12

0.1767

13

0.1774

14

0.1447

15

0.0945

16

0.0482

17

0.0185

18

0.0050

19

0.0009

20

0.0001

(a) E(X) = μ = 12.4

(b) σ = 2.1707

(c) P(X = 10) = 0.0974

(d) P(X ≤ 5) = P(X = 0) + P(X = 1) + ... + P(X = 5) = 0.0009

(e) P(X ≥ 5) = P(X = 5) + P(X = 6) + ... + P(X = 20) = 0.9998

Copyright ©2024 Pearson Education, Inc.


lxxii Chapter 5: Discrete Probability Distributions 5.41

Excel Output

Binomial Probabilities

Data Sample size

20

Probability of an event of interest

0.11

Statistics Mean

2.2

Variance

1.9580

Standard deviation

1.3993

Binomial Probabilities Table X

P(X) 0

0.0972

1

0.2403

2

0.2822

3

0.2093

4

0.1099

5

0.0435

6

0.0134

7

0.0033

8

0.0007

9

0.0001

10

0.0000

Copyright ©2024 Pearson Education, Inc.


Solutions to End-of-Section and Chapter Review Problems lxxiii 11

0.0000

12

0.0000

13

0.0000

14

0.0000

15

0.0000

16

0.0000

17

0.0000

18

0.0000

19

0.0000

20

0.0000

(a) E(X) = μ = 2.2

(b) σ = 1.3993

(c) P(X = 0) = 0.0972

(d) P(X ≤ 2) = P(X = 0) + P(X = 1) + P(X = 2) = 0.6198

(e) P(X ≥ 3) = P(X = 3) + P(X = 4) + ... + P(X = 20) = 0.3802

Alternatively, P(X ≥ 3) = 1 – P(X ≤ 2) = 1 – 0.6198 = 0.3802
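The binomial results for 5.41 can be checked outside Excel. The following is a minimal sketch, assuming only the standard binomial formula; the helper names `binom_pmf` and `binom_cdf` are illustrative and are not part of the manual or of PHStat.

```python
from math import comb


def binom_pmf(x: int, n: int, pi: float) -> float:
    """P(X = x) for n independent trials with success probability pi."""
    return comb(n, x) * pi**x * (1 - pi) ** (n - x)


def binom_cdf(x: int, n: int, pi: float) -> float:
    """P(X <= x), obtained by summing the individual probabilities."""
    return sum(binom_pmf(k, n, pi) for k in range(x + 1))


# Values taken from the Excel output for problem 5.41.
n, pi = 20, 0.11
mean = n * pi                          # E(X) = n * pi
std_dev = (n * pi * (1 - pi)) ** 0.5   # sigma = sqrt(n * pi * (1 - pi))
p_at_most_2 = binom_cdf(2, n, pi)      # part (d)
p_at_least_3 = 1 - p_at_most_2         # part (e), complement rule

print(round(mean, 1), round(std_dev, 4))
print(round(p_at_most_2, 4), round(p_at_least_3, 4))
```

The complement rule in the last step mirrors the "Alternatively" line above and avoids summing eighteen separate terms.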

Copyright ©2024 Pearson Education, Inc.


lxxiv Chapter 5: Discrete Probability Distributions 5.42

Partial Excel Output: Binomial Probabilities

Data Sample size

47

Probability of an event of interest

0.5

Statistics Mean

23.5

Variance

11.7500

Standard deviation

3.4278

Binomial Probabilities Table X

(a)

P(X) 38

0.0000

39

0.0000

40

0.0000

41

0.0000

42

0.0000

43

0.0000

44

0.0000

45

0.0000

46

0.0000

47

0.0000

π = 0.50, P(X ≥ 39) = 0.0000

Copyright ©2024 Pearson Education, Inc.


Solutions to End-of-Section and Chapter Review Problems lxxv 5.42 (b)Partial Excel Output: cont. Binomial Probabilities

Data Sample size

47

Probability of an event of interest

0.7

Statistics Mean

32.9

Variance

9.8700

Standard deviation

3.1417

Binomial Probabilities Table X

P(X) 38

0.0348

39

0.0188

40

0.0088

41

0.0035

42

0.0012

43

0.0003

44

0.0001

45

0.0000

46

0.0000

47

0.0000

π = 0.70, P(X ≥ 39) = 0.0326

Copyright ©2024 Pearson Education, Inc.


lxxvi Chapter 5: Discrete Probability Distributions 5.42 (c)Partial Excel Output: cont. Binomial Probabilities

Data Sample size

47

Probability of an event of interest

0.9

Statistics Mean

42.3

Variance

4.2300

Standard deviation

2.0567

Binomial Probabilities Table X

P(X) 38

0.0249

39

0.0516

40

0.0930

41

0.1428

42

0.1837

43

0.1922

44

0.1572

45

0.0943

46

0.0369

47

0.0071

π = 0.90, P(X ≥ 39) = 0.9589

(d) Based on the results in (a)–(c), the probability that the Standard & Poor’s 500 index will increase if there is an early gain in the first five trading days of the year is very likely to be close to 0.90, because that value yields a probability of 95.89% that the index will increase for the entire year in at least 39 of the 47 years.
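The comparison across the three candidate values of π in 5.42 can be reproduced with a short upper-tail sum. This is a sketch under the assumption that the count of years follows a binomial distribution with n = 47; the function name `binom_at_least` is hypothetical.

```python
from math import comb


def binom_at_least(x: int, n: int, pi: float) -> float:
    """P(X >= x) for a binomial variable, summed over the upper tail."""
    return sum(comb(n, k) * pi**k * (1 - pi) ** (n - k) for k in range(x, n + 1))


# Candidate probabilities considered in parts (a), (b), and (c).
results = {pi: binom_at_least(39, 47, pi) for pi in (0.50, 0.70, 0.90)}

for pi, prob in results.items():
    print(f"pi = {pi:.2f}: P(X >= 39) = {prob:.4f}")
```

Only the largest candidate makes 39 or more increases in 47 years a likely outcome, which is the reasoning behind part (d).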

Copyright ©2024 Pearson Education, Inc.


lxxviii Chapter 5: Discrete Probability Distributions 5.43

Excel Output: Binomial Probabilities

Data Sample size

55

Probability of an event of interest

0.5

Statistics Mean

27.5

Variance

13.7500

Standard deviation

3.7081

Binomial Probabilities Table X

P(X) 37

0.0040

38

0.0019

39

0.0008

40

0.0003

41

0.0001

42

0.0000

43

0.0000

44

0.0000

45

0.0000

46

0.0000

47

0.0000

48

0.0000

Copyright ©2024 Pearson Education, Inc.


Solutions to End-of-Section and Chapter Review Problems lxxix

(a) (b)

49

0.0000

50

0.0000

51

0.0000

52

0.0000

53

0.0000

54

0.0000

55

0.0000

π = 0.50, P(X ≥ 38) = 0.0032

It is ludicrous to believe that there is a correlation between the performance of the stock market and the winner of a Super Bowl. If the indicator is a random event, the probability of making a correct prediction 38 or more times out of 55 trials is nearly zero.

Copyright ©2024 Pearson Education, Inc.


lxxx Chapter 5: Discrete Probability Distributions

5.44Portion of PHStat Output:

POISSON.DIST Probabilities

Data Mean/Expected number of events of interest:

3

POISSON.DIST Probabilities Table

X     P(X)     P(<=X)   P(<X)    P(>X)    P(>=X)
0     0.0498   0.0498   0.0000   0.9502   1.0000
1     0.1494   0.1991   0.0498   0.8009   0.9502
2     0.2240   0.4232   0.1991   0.5768   0.8009
3     0.2240   0.6472   0.4232   0.3528   0.5768
4     0.1680   0.8153   0.6472   0.1847   0.3528
5     0.1008   0.9161   0.8153   0.0839   0.1847
6     0.0504   0.9665   0.9161   0.0335   0.0839
7     0.0216   0.9881   0.9665   0.0119   0.0335
8     0.0081   0.9962   0.9881   0.0038   0.0119
9     0.0027   0.9989   0.9962   0.0011   0.0038
10    0.0008   0.9997   0.9989   0.0003   0.0011
11    0.0002   0.9999   0.9997   0.0001   0.0003
12    0.0001   1.0000   0.9999   0.0000   0.0001
13    0.0000   1.0000   1.0000   0.0000   0.0000

(a) The assumptions needed are (i) the probability that a questionable claim is referred by an investigator is constant, (ii) the probability that a questionable claim is referred by an investigator approaches 0 as the interval gets smaller, and (iii) the probability that a questionable claim is referred by an investigator is independent from interval to interval.


(b) P(X = 5) = 0.1008

(c) P(X ≤ 10) = P(X = 0) + P(X = 1) + ... + P(X = 10) = 0.9997

(d) P(X ≥ 11) = 1 – P(X ≤ 10) = 1 – 0.9997 = 0.0003
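The Poisson values for 5.44 can be recomputed directly from the Poisson formula. The sketch below assumes a mean of 3 events per interval, as shown in the PHStat output; `poisson_pmf` and `poisson_cdf` are illustrative helper names.

```python
from math import exp, factorial


def poisson_pmf(x: int, lam: float) -> float:
    """P(X = x) for a Poisson variable with mean lam."""
    return exp(-lam) * lam**x / factorial(x)


def poisson_cdf(x: int, lam: float) -> float:
    """P(X <= x), summed term by term."""
    return sum(poisson_pmf(k, lam) for k in range(x + 1))


lam = 3
p_exactly_5 = poisson_pmf(5, lam)      # part (b)
p_at_most_10 = poisson_cdf(10, lam)    # part (c)
p_at_least_11 = 1 - p_at_most_10       # part (d), complement rule

print(round(p_exactly_5, 4), round(p_at_most_10, 4), round(p_at_least_11, 4))
```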




Chapter 6

6.1

PHStat output:

Normal Probabilities

Common Data
Mean                   0
Standard Deviation     1

Probability for X <=
X Value                1.47
Z Value                1.47
P(X<=1.47)             0.9292

Probability for X >
X Value                1.94
Z Value                1.94
P(X>1.94)              0.0262

Probability for X<1.47 or X >1.94
P(X<1.47 or X >1.94)   0.9554

Probability for a Range
From X Value           1.47
To X Value             1.94
Z Value for 1.47       1.47
Z Value for 1.94       1.94
P(X<=1.47)             0.9292
P(X<=1.94)             0.9738
P(1.47<=X<=1.94)       0.0446

(a) P(Z < 1.47) = 0.9292

(b) P(Z > 1.94) = 0.0262

(c) P(1.47 < Z < 1.94) = 0.9738 – 0.9292 = 0.0446

(d) P(Z < 1.47) + P(Z > 1.94) = 0.9292 + (1 – 0.9738) = 0.9554

6.2

PHStat output: Normal Probabilities

Common Data
Mean                    0
Standard Deviation      1

Probability for X <=
X Value                 –1.57
Z Value                 –1.57
P(X<=–1.57)             0.0582076

Probability for X >
X Value                 1.84
Z Value                 1.84
P(X>1.84)               0.0329

Probability for X<–1.57 or X >1.84
P(X<–1.57 or X >1.84)   0.0911

Probability for a Range
From X Value            1.57
To X Value              1.84
Z Value for 1.57        1.57
Z Value for 1.84        1.84
P(X<=1.57)              0.9418
P(X<=1.84)              0.9671
P(1.57<=X<=1.84)        0.0253

Find X and Z Given Cum. Pctage.
Cumulative Percentage   84.13%
Z Value                 0.999815
X Value                 0.999815

(a) P(–1.57 < Z < 1.84) = 0.9671 – 0.0582 = 0.9089

(b) P(Z < –1.57) + P(Z > 1.84) = 0.0582 + 0.0329 = 0.0911

(c) If P(Z > A) = 0.025, P(Z < A) = 0.975. A = +1.96.

(d) If P(–A < Z < A) = 0.6826, P(Z < A) = 0.8413. So 68.26% of the area is captured between –A = –1.00 and A = +1.00.

6.3

PHStat output: Normal Probabilities

Standard Deviation

Common Data Mean

1

Probability for X <= 0

X Value

Copyright ©2024 Pearson Education, Inc.

1.18


224 Chapter 6: The Normal Distribution and Other Continuous Distributions Z Value

1.18

P(X<=1.18)

0.8810

Probability for X > X Value

-0.31

Z Value

-0.31

P(X>-0.31)

0.6217

Probability for a Range

Probability for X<1.18 or X >-0.31

From X Value

P(X<1.18 or X >-0.31)

To X Value

1.5027

Z Value for -0.31

6.4

-0.31 0 -0.31

Z Value for 0

0

P(X<=-0.31)

0.3783

P(X<=0)

0.5000

P(-0.31<=X<=0)

0.1217

(a) P(Z < 1.18) = 0.8810

(b) P(Z > –0.31) = 0.6217

(c) P(–0.31 < Z < 0) = 0.5000 – 0.3783 = 0.1217

(d) P(Z < –0.31) + P(Z > 1.18) = (1 – 0.6217) + (1 – 0.8810) = 0.4973
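The standardized normal probabilities in (a) through (d) can be verified without PHStat. The sketch below assumes Python's standard-library `statistics.NormalDist`, whose `cdf` method returns the cumulative area to the left of a value; the variable names are illustrative.

```python
from statistics import NormalDist

std_normal = NormalDist(mu=0, sigma=1)

# (a) area to the left of Z = 1.18
p_a = std_normal.cdf(1.18)
# (b) area to the right of Z = -0.31
p_b = 1 - std_normal.cdf(-0.31)
# (c) area between Z = -0.31 and Z = 0
p_c = std_normal.cdf(0) - std_normal.cdf(-0.31)
# (d) area in the two tails: below -0.31 or above 1.18
p_d = std_normal.cdf(-0.31) + (1 - std_normal.cdf(1.18))

for label, p in zip("abcd", (p_a, p_b, p_c, p_d)):
    print(f"({label}) {p:.4f}")
```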

PHStat output:

Normal Probabilities

Copyright ©2024 Pearson Education, Inc.


248 Chapter 6: The Normal Distribution and Other Continuous Distributions Common Data Mean

0

Standard Deviation

1 Probability for a Range

Probability for X <=

From X Value

–1.96

X Value

–0.21

To X Value

–0.21

Z Value

–0.21

Z Value for –1.96

–1.96

0.4168338

Z Value for –0.21

–0.21

P(X<=–1.96)

0.0250

P(X<=–0.21)

0.4168

P(–1.96<=X<=–0.21)

0.3918

P(X<=–0.21)

Probability for X > X Value

1.08

Z Value

1.08

P(X>1.08)

0.1401

Find X and Z Given Cum. Pctage. Cumulative Percentage

6.5

84.13%

Probability for X<–0.21 or X >1.08

Z Value

0.999815

P(X<–0.21 or X >1.08)

X Value

0.999815

0.5569

(a)

P(Z > 1.08) = 1 – 0.8599 = 0.1401

(b)

P(Z < –0.21) = 0.4168

(c)

P(–1.96 < Z < –0.21) = 0.4168 – 0.0250 = 0.3918

(d)

P(Z > A) = 0.1587, P(Z < A) = 0.8413. A = + 1.00.

Partial PHStat output:

Normal Probabilities Common Data Mean

100

Copyright ©2024 Pearson Education, Inc.


Solutions to End-of-Section and Chapter Review Problems 249 Standard Deviation

10

Probability for X <= X Value

68

Z Value

-3.2

P(X<=68)

0.0007

Probability for X > X Value

78

Z Value

-2.2

P(X>78)

0.9861

(a) P(X > 78) = P(Z > –2.20) = 0.9861, where Z = (X – μ)/σ = (78 – 100)/10 = –2.20

(b) P(X < 68) = P(Z < –3.20) = 0.0007, where Z = (X – μ)/σ = (68 – 100)/10 = –3.20

(c) Partial PHStat output:

Normal Probabilities Common Data Mean

100

Standard Deviation

10

Probability for X <= X Value

78

Z Value

-2.2

P(X<=78)

0.0139

Probability for X > X Value

100

Copyright ©2024 Pearson Education, Inc.


250 Chapter 6: The Normal Distribution and Other Continuous Distributions Z Value

0

P(X>100)

0.5000

Probability for X<78 or X >100 P(X<78 or X >100)

0.5139

P(X < 78) + P(X > 100) = P(Z < –2.20) + P(Z > 0) = 0.0139 + 0.5 = 0.5139

Copyright ©2024 Pearson Education, Inc.


Solutions to End-of-Section and Chapter Review Problems 251 6.5

(d)

Partial PHStat output:

cont. Find X Values Given a Percentage Percentage

80.00%

Z Value

-1.28

Lower X Value

87.18

Upper X Value

112.82

P(Xlower < X < Xupper) = 0.80

P(Z < –1.28) = 0.10 and P(Z < 1.28) = 0.90

Z = –1.28 = (Xlower – 100)/10 and Z = +1.28 = (Xupper – 100)/10

Xlower = 100 – 1.28(10) = 87.20 and Xupper = 100 + 1.28(10) = 112.80
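The symmetric limits in 6.5 (d) come from inverting the normal cumulative distribution. The sketch below assumes `statistics.NormalDist.inv_cdf` from the Python standard library; the small difference between the hand result (87.20, 112.80) and the PHStat result (87.18, 112.82) arises because the hand calculation rounds Z to 1.28.

```python
from statistics import NormalDist

dist = NormalDist(mu=100, sigma=10)

# The middle 80% leaves 10% in each tail.
x_lower = dist.inv_cdf(0.10)
x_upper = dist.inv_cdf(0.90)

# The same limits using the table value Z = 1.28, as in the hand solution.
x_lower_table = 100 - 1.28 * 10
x_upper_table = 100 + 1.28 * 10

print(round(x_lower, 2), round(x_upper, 2))
print(round(x_lower_table, 2), round(x_upper_table, 2))
```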

6.6

Partial PHStat output:

Common Data
Mean                    50
Standard Deviation      4

Probability for X <=
X Value                 42
Z Value                 –2
P(X<=42)                0.0227501

Probability for X >
X Value                 43
Z Value                 –1.75
P(X>43)                 0.9599

Probability for X<42 or X >43
P(X<42 or X >43)        0.9827

Probability for a Range
From X Value            42
To X Value              43
Z Value for 42          –2
Z Value for 43          –1.75
P(X<=42)                0.0228
P(X<=43)                0.0401
P(42<=X<=43)            0.0173

Find X and Z Given Cum. Pctage.
Cumulative Percentage   5.00%
Z Value                 –1.644854
X Value                 43.42059

(a) P(X > 43) = P(Z > –1.75) = 1 – 0.0401 = 0.9599

(b) P(X < 42) = P(Z < –2.00) = 0.0228

(c) P(X < A) = 0.05, Z = –1.645 = (A – 50)/4, A = 50 – 1.645(4) = 43.42

(d) Partial PHStat output:

Find X and Z Given Cum. Pctage.
Cumulative Percentage   80.00%
Z Value                 0.841621
X Value                 53.36648

P(Xlower < X < Xupper) = 0.60

P(Z < –0.84) = 0.20 and P(Z < 0.84) = 0.80

Z = –0.84 = (Xlower – 50)/4 and Z = +0.84 = (Xupper – 50)/4

Xlower = 50 – 0.84(4) = 46.64 and Xupper = 50 + 0.84(4) = 53.36

6.7

μ = 45.2, σ = 10

(a) P(X > 33) = P(Z > –1.22) = 0.8888

Probability for X >
X Value                 33
Z Value                 -1.22
P(X>33)                 0.8888

(b) P(10 < X < 20) = P(–3.52 < Z < –2.52) = 0.0057

Probability for a Range
From X Value            10
To X Value              20
Z Value for 10          -3.52
Z Value for 20          -2.52
P(X<=10)                0.0002
P(X<=20)                0.0059
P(10<=X<=20)            0.0057

(c) P(X < 10) = P(Z < –3.52) = 0.0002

Probability for X <=
X Value                 10
Z Value                 -3.52
P(X<=10)                0.0002

(d) P(X < A) = 0.99, Z = 2.3263 = (A – 45.2)/10, A = 68.4635

Find X and Z Given Cum. Pctage.
Cumulative Percentage   99.00%
Z Value                 2.3263
X Value                 68.4635

6.8

μ = 43,647, σ = 10,000

(a) P(34,000 < X < 50,000) = P(–0.9647 < Z < 0.6353) = 0.7374 – 0.1673 = 0.5700

Probability for a Range

From X Value

34000

To X Value

50000

Copyright ©2024 Pearson Education, Inc.


254 Chapter 6: The Normal Distribution and Other Continuous Distributions Z Value for 34000

-0.9647

Z Value for 50000

0.6353

P(X<=34000)

0.1673

P(X<=50000)

0.7374

P(34000<=X<=50000)

0.5700

Copyright ©2024 Pearson Education, Inc.


Solutions to End-of-Section and Chapter Review Problems 255 6.8 cont.

(b) P(X < 30,000) + P(X > 60,000) = P(Z < –1.3647) + P(Z > 1.6353) = 0.0862 + 0.0510 = 0.1372

Probability for X <= X Value

30000

Z Value

-1.3647

P(X<=30000)

0.0862

Probability for X > X Value

60000

Z Value

1.6353

P(X>60000)

0.0510

Probability for X<30000 or X >60000
P(X<30000 or X >60000)   0.1372

(c) P(X < A) = 0.80, Z = 0.8416 = (A – 43,647)/10,000, A = 52,063.2123

Find X and Z Given Cum. Pctage.
Cumulative Percentage    80.00%
Z Value                  0.8416
X Value                  52063.2123

(d) μ = 43,647, σ = 12,000

(a) P(34,000 < X < 50,000) = P(–0.8039 < Z < 0.5294) = 0.7017 – 0.2107 = 0.4910

Probability for a Range

From X Value

34000

To X Value

50000

Z Value for 34000

-0.803917

Z Value for 50000

0.5294167

Copyright ©2024 Pearson Education, Inc.


256 Chapter 6: The Normal Distribution and Other Continuous Distributions P(X<=34000)

0.2107

P(X<=50000)

0.7017

P(34000<=X<=50000)

0.4910

Copyright ©2024 Pearson Education, Inc.


Solutions to End-of-Section and Chapter Review Problems 257 6.8 cont.

(d) (b) P(X < 30,000) + P(X > 60,000) = P(Z < –1.137) + P(Z > 1.363) = 0.1277 + 0.0865 = 0.2142

Probability for X <=

X Value

30000

Z Value

-1.13725

P(X<=30000)

0.1277

Probability for X > X Value

60000

Z Value

1.36275

P(X>60000)

0.0865

Probability for X<30000 or X >60000
P(X<30000 or X >60000)   0.2142

(c) P(X < A) = 0.80, Z = 0.8416 = (A – 43,647)/12,000, A = 53,746.4548

Find X and Z Given Cum. Pctage.
Cumulative Percentage    80.00%
Z Value                  0.8416
X Value                  53746.4548

6.9

μ = 139.33, σ = 25

(a) P(X > 100) = P(Z > –1.5732) = 0.9422

Probability for X >
X Value                  100
Z Value                  -1.5732
P(X>100)                 0.9422

(b) P(100 < X < 200) = P(–1.5732 < Z < 2.4268) = 0.9924 – 0.0578 = 0.9345

Copyright ©2024 Pearson Education, Inc.


258 Chapter 6: The Normal Distribution and Other Continuous Distributions Probability for a Range From X Value

100

To X Value

200

Z Value for 100

-1.5732

Z Value for 200

2.4268

P(X<=100)

0.0578

P(X<=200)

0.9924

P(100<=X<=200)

0.9345

Copyright ©2024 Pearson Education, Inc.


6.9 cont.

(c) P(Xlower < X < Xupper) = 0.95

P(Z < –1.96) = 0.0250 and P(Z < 1.96) = 0.975

Z = –1.96 = (Xlower – 139.33)/25 and Z = +1.96 = (Xupper – 139.33)/25

Xlower = 139.33 – 1.96(25) = 90.33 and Xupper = 139.33 + 1.96(25) = 188.33

Find X Values Given a Percentage Percentage

6.10

95.00%

Z Value

-1.96

Lower X Value

90.33

Upper X Value

188.33

PHStat output: Common Data Mean

73

Standard Deviation

8 Probability for a Range

Probability for X <=

From X Value

65

To X Value

89

X Value

91

Z Value

2.25

Z Value for 65

–1

0.9877755

Z Value for 89

2

P(X<=91)

Probability for X > X Value

81

Z Value

1

P(X>81)

0.1587

P(X<=65)

0.1587

P(X<=89)

0.9772

P(65<=X<=89)

0.8186

Find X and Z Given Cum. Pctage. Cumulative Percentage

Copyright ©2024 Pearson Education, Inc.

95.00%


260 Chapter 6: The Normal Distribution and Other Continuous Distributions Probability for X<91 or X >81 P(X<91 or X >81)

1.1464

Z Value

1.644854

X Value

86.15883

(a) P(X < 91) = P(Z < 2.25) = 0.9878

(b) P(65 < X < 89) = P(–1.00 < Z < 2.00) = 0.9772 – 0.1587 = 0.8185

(c) P(X > A) = 0.05, P(Z < 1.645) = 0.9500, Z = 1.645 = (A – 73)/8, A = 73 + 1.645(8) = 86.16%

(d) Option 1: P(X > A) = 0.10, P(Z < 1.28) ≈ 0.9000, Z = (81 – 73)/8 = 1.00. Since your score of 81% on this exam represents a Z-score of 1.00, which is below the minimum Z-score of 1.28, you will not earn an “A” grade on the exam under this grading option.

Option 2: Z = (68 – 62)/3 = 2.00. Since your score of 68% on this exam represents a Z-score of 2.00, which is well above the minimum Z-score of 1.28, you will earn an “A” grade on the exam under this grading option. You should prefer Option 2.
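The comparison of the two grading options in 6.10 (d) reduces to comparing each score's Z-score with the cutoff for the top 10%. The sketch below assumes the means and standard deviations shown in the solution (73 and 8 for Option 1, 62 and 3 for Option 2); the helper `z_score` is an illustrative name.

```python
from statistics import NormalDist


def z_score(x: float, mean: float, std_dev: float) -> float:
    """Standardize a value: Z = (X - mean) / std_dev."""
    return (x - mean) / std_dev


# Z value that leaves 10% of the area in the upper tail.
cutoff = NormalDist().inv_cdf(0.90)

option_1 = z_score(81, mean=73, std_dev=8)
option_2 = z_score(68, mean=62, std_dev=3)

print(round(cutoff, 2))
print("Option 1 earns an A:", option_1 >= cutoff)
print("Option 2 earns an A:", option_2 >= cutoff)
```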

Copyright ©2024 Pearson Education, Inc.


Solutions to End-of-Section and Chapter Review Problems 261 6.11

(a) μ = 37.1, σ = 12, P(X > 33) = P(Z > –0.34166) = 0.6337

Probability for X >
X Value                 33
Z Value                 -0.341667
P(X>33)                 0.6337

(b) P(20 < X < 30) = P(–1.425 < Z < –0.592) = 0.2770 – 0.0771 = 0.2000

Probability for a Range
From X Value            20
To X Value              30
Z Value for 20          -1.425
Z Value for 30          -0.591667
P(X<=20)                0.0771
P(X<=30)                0.2770
P(20<=X<=30)            0.2000

(c) P(X < 20) = P(Z < –1.425) = 0.0771

Probability for X <=
X Value                 20
Z Value                 -1.425
P(X<=20)                0.0771

(d) P(X < A) = 0.99, Z = 2.3263 = (A – 37.1)/12, A = 65.0162

Find X and Z Given Cum. Pctage.
Cumulative Percentage   99.00%
Z Value                 2.3263
X Value                 65.0162

(e) The per capita consumption of bottled water in Germany is much lower than the per capita consumption of bottled water in the United States.


262 Chapter 6: The Normal Distribution and Other Continuous Distributions

6.12

(a) μ = 39.6, σ = 8, P(X > 50) = P(Z > 1.3) = 0.0968

Probability for X >
X Value                 50
Z Value                 1.3
P(X>50)                 0.0968

(b) P(25 < X < 40) = P(–1.825 < Z < 0.05) = 0.5199 – 0.0340 = 0.4859

Probability for a Range
From X Value            25
To X Value              40
Z Value for 25          -1.825
Z Value for 40          0.05
P(X<=25)                0.0340
P(X<=40)                0.5199
P(25<=X<=40)            0.4859

(c) P(X < 10) = P(Z < –3.7) = 0.0001

Probability for X <=
X Value                 10
Z Value                 -3.7
P(X<=10)                0.0001

(d) P(X < A) = 0.99, Z = 2.3263 = (A – 39.6)/8, A = 58.2108

Find X and Z Given Cum. Pctage.
Cumulative Percentage   99.00%
Z Value                 2.3263
X Value                 58.2108

Copyright ©2024 Pearson Education, Inc.


Solutions to End-of-Section and Chapter Review Problems 263

6.13

(a)

Partial PHStat output:

Probability for a Range From X Value

21.99

To X Value

22

Z Value for 21.99

–2.4

Z Value for 22

–0.4

P(X<=21.99)

0.0082

P(X<=22)

0.3446

P(21.99<=X<=22)

0.3364

P(21.99 < X < 22.00) = P(– 2.4 < Z < – 0.4) = 0.3364 (b)

Partial PHStat output:

Probability for a Range From X Value

21.99

To X Value

22.01

Z Value for 21.99

–2.4

Z Value for 22.01

1.6

P(X<=21.99)

0.0082

P(X<=22.01)

0.9452

P(21.99<=X<=22.01)

0.9370

P(21.99 < X < 22.01) = P(–2.4 < Z < 1.6) = 0.9370 (c)

Partial PHStat output:

Find X and Z Given Cum. Pctage.

Copyright ©2024 Pearson Education, Inc.


264 Chapter 6: The Normal Distribution and Other Continuous Distributions Cumulative Percentage

98.00%

Z Value

2.05375

X Value

22.0123

P(X > A) = 0.02

Z = 2.05

A = 22.0123

Copyright ©2024 Pearson Education, Inc.


Solutions to End-of-Section and Chapter Review Problems 265 6.13

(d)

(a) Partial PHStat output:

cont.

Probability for a Range From X Value

21.99

To X Value

22

Z Value for 21.99

–3

Z Value for 22

–0.5

P(X<=21.99)

0.0013

P(X<=22)

0.3085

P(21.99<=X<=22)

0.3072

P(21.99 < X < 22.00) = P(– 3.0 < Z < – 0.5) = 0.3072 (b) Partial PHStat output:

Probability for a Range From X Value

21.99

To X Value

22.01

Z Value for 21.99

–3

Z Value for 22.01

2

P(X<=21.99)

0.0013

P(X<=22.01)

0.9772

P(21.99<=X<=22.01)

0.9759

P(21.99 < X < 22.01) = P(– 3.0 < Z < 2) = 0.9759 (c) Partial PHStat output:

Find X and Z Given Cum. Pctage.

Copyright ©2024 Pearson Education, Inc.


266 Chapter 6: The Normal Distribution and Other Continuous Distributions Cumulative Percentage

98.00%

Z Value

2.05375

X Value

22.0102

P(X > A) = 0.02

Z = 2.05

A = 22.0102

6.14

With 39 values, the smallest of the standard normal quantile values covers an area under the normal curve of 0.025. The corresponding Z value is –1.96. The middle (20th) value has a cumulative area of 0.50 and a corresponding Z value of 0.0. The largest of the standard normal quantile values covers an area under the normal curve of 0.975, and its corresponding Z value is +1.96.

6.15

Area under normal curve covered:       0.1429   0.2857   0.4286   0.5714   0.7143   0.8571
Standardized normal quantile value:    –1.07    –0.57    –0.18    +0.18    +0.57    +1.07
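The quantile values in 6.14 and 6.15 follow from dividing the area under the normal curve into n + 1 equal parts and inverting the cumulative distribution at each boundary. The sketch below assumes that convention (cumulative area i/(n + 1) for the i-th ordered value) and uses `statistics.NormalDist.inv_cdf`; the function name `normal_quantiles` is illustrative.

```python
from statistics import NormalDist


def normal_quantiles(n: int) -> list[float]:
    """Standardized normal quantile values for n ordered observations."""
    std_normal = NormalDist()
    return [std_normal.inv_cdf(i / (n + 1)) for i in range(1, n + 1)]


# Problem 6.15: six values, areas 1/7, 2/7, ..., 6/7.
areas_6 = [round(i / 7, 4) for i in range(1, 7)]
quantiles_6 = [round(z, 2) for z in normal_quantiles(6)]

# Problem 6.14: 39 values, so the smallest covers an area of 1/40 = 0.025.
quantiles_39 = normal_quantiles(39)

print(areas_6)
print(quantiles_6)
print(round(quantiles_39[0], 2), round(quantiles_39[19], 2), round(quantiles_39[-1], 2))
```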

Copyright ©2024 Pearson Education, Inc.


Solutions to End-of-Section and Chapter Review Problems 267 6.16

(a)

Excel output: Before Halftime Ad Ratings Descriptive Summary

Rating

Five-Number Summary

Mean

5.789286

Minimum

4.5

Median

5.8

First Quartile

5.3

Mode

5.3

Median

5.8

Minimum

4.5

Third Quartile

6.4

Maximum

7.4

Maximum

7.4

Range

2.9

IQR

1.1

Variance

0.4928

Standard Deviation

0.7020

1.33S

0.933698

Coeff. of Variation

12.13%

6*Std dev

4.212171

Skewness

0.2392

Kurtosis

-0.5132

Count

28

Standard Error

0.1327

Copyright ©2024 Pearson Education, Inc.


268 Chapter 6: The Normal Distribution and Other Continuous Distributions

Super Bowl Ad Ratings First and Second Period: (a) Mean = 5.7893, median = 5.8, S = 0.7020, range = 2.9, 6S = 4.212, interquartile range = 1.1, 1.33(0.7020) = 0.9337. The mean is approximately the same as the median. The range is much less than 6S, and the interquartile range is more than 1.33S. The skewness statistic is 0.2392, indicating a symmetric distribution, and the kurtosis statistic is –0.5132, indicating a platykurtic distribution. 6.16 cont.

(a)

Excel output: Halftime and Afterwards Ad Ratings

Descriptive Summary

                       Rating
Mean                   5.534483
Median                 5.6
Mode                   5.3
Minimum                4
Maximum                7.3
Range                  3.3
Variance               0.5031
Standard Deviation     0.7093
Coeff. of Variation    12.82%
Skewness               0.1578
Kurtosis               0.5515
Count                  29
Standard Error         0.1317

Five-Number Summary
Minimum                4
First Quartile         5.05
Median                 5.6
Third Quartile         5.95
Maximum                7.3
IQR                    0.9

1.33S                  0.94332
6*Std dev              4.255579

Super Bowl Ad Ratings Halftime and Afterward: (a) Mean = 5.5345, median = 5.6, S = 0.7093, range = 3.3, 6S = 4.2556, interquartile range = 0.9, 1.33(0.7093) = 0.9433. The mean is approximately the same as the median. The range is much less than 6S, and the interquartile range is close to 1.33S. The skewness statistic is 0.1578, indicating an approximately symmetric distribution, and the kurtosis statistic is 0.5515, indicating a platykurtic distribution.

Copyright ©2024 Pearson Education, Inc.


270 Chapter 6: The Normal Distribution and Other Continuous Distributions 6.16

(b)

cont.

The data for the halftime and afterwards appear to follow a normal distribution.


272 Chapter 6: The Normal Distribution and Other Continuous Distributions 6.17

(a)Excel Ouput:

Descriptive Summary Team Value

Payroll

Wins

Mean

2.074

147.2133333

80.96666667

Median

1.73

141.555

81

Mode

#N/A

#N/A

77

Minimum

0.99

48.06

52

Maximum

6

284.73

107

Range

5.01

236.67

55

Variance

1.2997

3869.0060

209.3437

Standard Deviation

1.1401

62.2013

14.4687

Coeff. of Variation

54.97%

42.25%

17.87%

Skewness

1.8739

0.3460

-0.2659

Kurtosis

3.8080

-0.4260

-0.4135

Count

30

30

30

Standard Error

0.2081

11.3564

2.6416

6*std dev

6.8404

373.2080

86.8123

Minimum

0.99

48.06

52

First Quartile

1.32

94.93

73

Median

1.73

141.555

81

Third Quartile

2.3

184.63

92

Maximum

6

284.73

107

Five-Number Summary

Copyright ©2024 Pearson Education, Inc.


Solutions to End-of-Section and Chapter Review Problems 273 IQR

0.98

89.7

19

1.33*S

1.5163

82.7278

19.2434

6.17 (a) cont.

Copyright ©2024 Pearson Education, Inc.


274 Chapter 6: The Normal Distribution and Other Continuous Distributions

Team Value: Mean = 2.074, median = 1.73, S = 1.1401, range = 5.01, 6S = 6.8404, interquartile range = 0.98, 1.33(1.1401) = 1.5163. The mean is greater than the median. The range is much less than 6S, and the interquartile range is much less than 1.33S. The skewness statistic is 1.8739, and the kurtosis statistic is 3.8080.

Payroll: Mean = 147.2133, median = 141.555, S = 62.2013, range = 236.67, 6S = 373.2080, interquartile range = 89.7, 1.33(62.2013) = 82.7278. The mean is greater than the median. The range is much less than 6S, and the interquartile range is more than 1.33S. The skewness statistic is 0.3460, indicating a symmetric distribution, and the kurtosis statistic is –0.4260, indicating a platykurtic distribution.

Wins: Mean = 80.967, median = 81, S = 14.4687, range = 55, 6S = 86.8123, interquartile range = 19, 1.33(14.4687) = 19.2434. The mean is approximately the same as the median. The range is much less than 6S, and the interquartile range is approximately the same as 1.33S. The skewness statistic is –0.2659, and the kurtosis statistic is –0.4135, indicating a platykurtic distribution.

6.17 cont.

(b)

Copyright ©2024 Pearson Education, Inc.


Solutions to End-of-Section and Chapter Review Problems 275

6.18

(a)

(b)Excel Output:


Solutions to End-of-Section and Chapter Review Problems 277 6.18

(a)

(b)

cont.

The mean is greater than the median. The range is much less than 6S, and the IQR is more than 1.33S. The box plot is right skewed. The normal probability plot along with the skewness and kurtosis statistics indicate a departure from the normal distribution. 6.19

(a) Descriptive Summary

Market Cap

Five-Number Summary

Mean

397.16

Minimum

38.96

Median

217.48

First Quartile

123.49

Mode

#N/A

Median

217.475

Minimum

38.96

Third Quartile

404.34

Maximum

2941.00

Maximum

2941

Range

2902.04

IQR

280.85

Variance

422032.2302

Copyright ©2024 Pearson Education, Inc.


278 Chapter 6: The Normal Distribution and Other Continuous Distributions Standard Deviation

649.6401

1.33S

864.0213

Coeff. of Variation

163.57%

6*Std dev

3897.84

Skewness

3.4450

Kurtosis

11.4249

Count

30

Standard Error

118.6075

The range is much less than 6S and the IQR is less than 1.33S, the mean is larger than the median, the normal probability plot appears right skewed, the histogram appears right-skewed and both the skewness and kurtosis statistics indicate a departure from a normal distribution. 6.19

(b)

cont.

(c)


6.20

Excel output:

                       Error
Mean                   –0.00023
Median                 0
Mode                   0
Standard Deviation     0.001696
Sample Variance        2.88E-06
Range                  0.008
Minimum                –0.003
Maximum                0.005
First Quartile         –0.0015
Third Quartile         0.001
1.33 Std Dev           0.002255
Interquartile Range    0.0025
6 Std Dev              0.010175

(a) Because the interquartile range is close to 1.33S and the range is also close to 6S, the data appear to be approximately normally distributed.

(b) [Normal probability plot: Error versus Z Value]

The normal probability plot suggests that the data appear to be approximately normally distributed.
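The two rules of thumb used throughout this section (interquartile range close to 1.33S, range close to 6S) are simple to compute from summary statistics. The sketch below uses the rounded figures from the 6.20 Excel output, so the last digit can differ slightly from Excel's unrounded results; the function name and the dictionary keys are illustrative assumptions.

```python
def normality_rules_of_thumb(std_dev: float, iqr: float, value_range: float) -> dict:
    """Compare the IQR with 1.33S and the range with 6S.

    Ratios near 1 are consistent with an approximately normal distribution.
    """
    return {
        "1.33S": 1.33 * std_dev,
        "6S": 6 * std_dev,
        "IQR / 1.33S": iqr / (1.33 * std_dev),
        "range / 6S": value_range / (6 * std_dev),
    }


# Summary statistics for the Error variable in problem 6.20.
check = normality_rules_of_thumb(std_dev=0.001696, iqr=0.0025, value_range=0.008)

for name, value in check.items():
    print(f"{name}: {value:.6f}")
```

These ratios are a screening device only; the normal probability plot remains the more informative check.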

Copyright ©2024 Pearson Education, Inc.


Solutions to End-of-Section and Chapter Review Problems 281 6.21

Excel Output: Descriptive Summary

One-Year CD

Five-Year CD

Mean

0.18

0.38

Median

0.12

0.30

Mode

0.1

0.2

Minimum

0.02

0.02

Maximum

0.55

1.00

Range

0.53

0.98

Variance

0.0246

0.0763

Standard Deviation

0.1570

0.2762

Coeff. of Variation

89.12%

72.62%

Skewness

1.4615

0.6486

Kurtosis

1.3764

–0.2609

Count

36

36

Standard Error

0.0262

0.0460

6*Std dev

0.94174913

1.65697401

1.33*S

0.20875439

0.3672959

Minimum

0.02

0.02

First Quartile

0.05

0.15

Median

0.115

0.3

Third Quartile

0.25

0.55

Five-Number Summary

Copyright ©2024 Pearson Education, Inc.


282 Chapter 6: The Normal Distribution and Other Continuous Distributions Maximum

0.55

1

Interquartile Range

0.2

0.4

(a)

For the One-year CD the mean is larger than the median; the range is smaller than 6 times the standard deviation, and the interquartile range is smaller than 1.33 times the standard deviation. The data do not appear to be normally distributed. For the Five-Year CD the mean is larger than the median; the range is smaller than 6 times the standard deviation, and the interquartile range is larger than 1.33 times the standard deviation. The data appear to deviate from the normal distribution.

(b)

6.22

(a) Excel Output

Descriptive Summary
                        Electricity
Mean                    110.254902
Median                  108
Mode                    95
Minimum                 73
Maximum                 160
Range                   87
Variance                321.9937
Standard Deviation      17.9442
Coeff. of Variation     16.28%
Skewness                0.4685
Kurtosis                0.2788
Count                   51
Standard Error          2.5127

Five-Number Summary
Minimum                 73
First Quartile          95
Median                  108
Third Quartile          123
Maximum                 160

IQR                     28
6*Std deviation         107.6651
1.33*S                  23.8658

The mean is close to the median. The five-number summary suggests that the distribution is quite symmetrical around the median. The interquartile range (28) is larger than 1.33 times the standard deviation (23.87). The range (87) is about $20 below 6 times the standard deviation (107.67). In general, the distribution of the data appears to closely resemble a normal distribution.

(b) The normal probability plot confirms that the data appear to be approximately normally distributed.

6.23

(a) P(5 < X < 7) = (7 – 5)/(10 – 0) = 0.2
(b) P(2 < X < 3) = (3 – 2)/(10 – 0) = 0.1
(c) μ = (0 + 10)/2 = 5
(d) σ = √((10 – 0)²/12) = 2.8868

6.24

(a) P(0 < X < 20) = (20 – 0)/100 = 0.20
(b) P(10 < X < 30) = (30 – 10)/100 = 0.20
(c) P(35 < X < 100) = (100 – 35)/100 = 0.65
(d) μ = (0 + 100)/2 = 50
    σ = √((100 – 0)²/12) = 28.8675

6.25

(a) P(25 < X < 30) = (30 – 25)/(60 – 20) = 0.125
(b) P(20 < X < 35) = (35 – 20)/(60 – 20) = 0.375
(c) μ = (20 + 60)/2 = 40
(d) σ = √((60 – 20)²/12) = 11.5470

6.26

(a) P(23 < X < 30) = (30 – 23)/(30 – 20) = 0.70
(b) P(25 < X < 30) = (30 – 25)/(30 – 20) = 0.50
(c) P(20 < X < 25) = (25 – 20)/(30 – 20) = 0.50
(d) μ = (20 + 30)/2 = 25
    σ = √((30 – 20)²/12) = 2.8868

6.27

(a) P(59 < X < 70) = (70 – 59)/(75 – 59) = 0.6875
(b) P(65 < X < 70) = (70 – 65)/(75 – 59) = 0.3125
(c) P(65 < X < 75) = (75 – 65)/(75 – 59) = 0.6250
(d) μ = (75 + 59)/2 = 67
    σ = √((75 – 59)²/12) = 4.6188
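The uniform-distribution results in 6.23 through 6.27 all follow from three formulas: P(c < X < d) = (d – c)/(b – a), μ = (a + b)/2, and σ = √((b – a)²/12). A minimal Python sketch, with hypothetical function names chosen for illustration:

```python
import math


def uniform_probability(c, d, a, b):
    """P(c < X < d) for X uniformly distributed between a and b."""
    lower = max(c, a)          # clip the interval to the support [a, b]
    upper = min(d, b)
    return max(upper - lower, 0) / (b - a)


def uniform_mean(a, b):
    """Mean of a uniform distribution on [a, b]."""
    return (a + b) / 2


def uniform_std(a, b):
    """Standard deviation of a uniform distribution on [a, b]."""
    return math.sqrt((b - a) ** 2 / 12)


# Problem 6.23: uniform between 0 and 10
print(uniform_probability(5, 7, 0, 10))
print(uniform_mean(0, 10))
print(round(uniform_std(0, 10), 4))
```

Calling the same functions with a = 59 and b = 75 gives the values for 6.27.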


6.28

Using Table E.2, first find the cumulative area up to the larger value, and then subtract the cumulative area up to the smaller value.

6.29

Find the Z value corresponding to the given percentile and then use the equation X = μ + Zσ.
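The two procedures in 6.28 and 6.29 can be sketched with Python's standard-library `statistics.NormalDist` in place of Table E.2. The helper names below are hypothetical; the numbers are those of problem 6.33 (μ = 0.753, σ = 0.004).

```python
from statistics import NormalDist


def range_probability(lower, upper, mean, std_dev):
    """P(lower < X < upper): cumulative area up to the larger value
    minus the cumulative area up to the smaller value."""
    dist = NormalDist(mean, std_dev)
    return dist.cdf(upper) - dist.cdf(lower)


def value_at_percentile(cum_prob, mean, std_dev):
    """Find Z for the given cumulative probability, then X = mean + Z * std_dev."""
    z = NormalDist().inv_cdf(cum_prob)
    return mean + z * std_dev


# Problem 6.33 (a) and (e)
print(round(range_probability(0.75, 0.753, 0.753, 0.004), 4))
print(round(value_at_percentile(0.07, 0.753, 0.004), 4))
```

Results can differ from table-based answers in the fourth decimal place because the table rounds Z to two decimals.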

6.30

The normal distribution is bell-shaped; its measures of central tendency are all equal; its middle 50% spans 1.33 standard deviations (the interquartile range equals 1.33σ, about 0.67 standard deviations on either side of the mean); and 99.7% of its values are contained within three standard deviations of its mean.

6.31

Both the normal distribution and the uniform distribution are symmetric but the uniform distribution has a bounded range while the normal distribution ranges from negative infinity to positive infinity. The exponential distribution is right-skewed and ranges from zero to infinity.

6.32

If the distribution is normal, the plot of the Z values on the horizontal axis and the original values on the vertical axis will be a straight line.

6.33

(a) Partial PHStat output:

Probability for a Range
From X Value            0.75
To X Value              0.753
Z Value for 0.75        –0.75
Z Value for 0.753       0
P(X<=0.75)              0.2266
P(X<=0.753)             0.5000
P(0.75<=X<=0.753)       0.2734

P(0.75 < X < 0.753) = P(–0.75 < Z < 0) = 0.2734

(b) Partial PHStat output:

Probability for a Range
From X Value            0.74
To X Value              0.75
Z Value for 0.74        –3.25
Z Value for 0.75        –0.75
P(X<=0.74)              0.0006
P(X<=0.75)              0.2266
P(0.74<=X<=0.75)        0.2261

P(0.74 < X < 0.75) = P(–3.25 < Z < –0.75) = 0.2266 – 0.00058 = 0.2260

(c) Partial PHStat output:

Probability for X >
X Value                 0.76
Z Value                 1.75
P(X>0.76)               0.0401

P(X > 0.76) = P(Z > 1.75) = 1.0 – 0.9599 = 0.0401

(d) Partial PHStat output:

Probability for X <=
X Value                 0.74
Z Value                 –3.25
P(X<=0.74)              0.000577

P(X < 0.74) = P(Z < –3.25) = 0.00058

(e) Partial PHStat output:

Find X and Z Given Cum. Pctage.
Cumulative Percentage   7.00%
Z Value                 –1.475791
X Value                 0.747097

P(X < A) = P(Z < –1.48) = 0.07
A = 0.753 – 1.48(0.004) = 0.7471

6.34

(a)

Partial PHStat output:

Probability for a Range
From X Value            1.9
To X Value              2
Z Value for 1.9         –2
Z Value for 2           0
P(X<=1.9)               0.0228
P(X<=2)                 0.5000
P(1.9<=X<=2)            0.4772

P(1.90 < X < 2.00) = P(–2.00 < Z < 0) = 0.4772

(b) Partial PHStat output:

Probability for a Range
From X Value            1.9
To X Value              2.1
Z Value for 1.9         –2
Z Value for 2.1         2
P(X<=1.9)               0.0228
P(X<=2.1)               0.9772
P(1.9<=X<=2.1)          0.9545

P(1.90 < X < 2.10) = P(–2.00 < Z < 2.00) = 0.9772 – 0.0228 = 0.9544

(c) Partial PHStat output:

Probability for X<1.9 or X >2.1
P(X<1.9 or X >2.1)      0.0455

P(X < 1.90) + P(X > 2.10) = 1 – P(1.90 < X < 2.10) = 0.0456

(d) Partial PHStat output:

Find X and Z Given Cum. Pctage.
Cumulative Percentage   1.00%
Z Value                 –2.326348
X Value                 1.883683

P(X > A) = P(Z > –2.33) = 0.99
A = 2.00 – 2.33(0.05) = 1.8835

(e) Partial PHStat output:

Find X and Z Given Cum. Pctage.
Cumulative Percentage   99.50%
Z Value                 2.575829
X Value                 2.128791

P(A < X < B) = P(–2.58 < Z < 2.58) = 0.99
A = 2.00 – 2.58(0.05) = 1.8710
B = 2.00 + 2.58(0.05) = 2.1290

6.35

(a)

Partial PHStat output:

Probability for a Range
From X Value            1.9
To X Value              2
Z Value for 1.9         –2.4
Z Value for 2           –0.4
P(X<=1.9)               0.0082
P(X<=2)                 0.3446
P(1.9<=X<=2)            0.3364

P(1.90 < X < 2.00) = P(–2.40 < Z < –0.40) = 0.3446 – 0.0082 = 0.3364

(b) Partial PHStat output:

Probability for a Range
From X Value            1.9
To X Value              2.1
Z Value for 1.9         –2.4
Z Value for 2.1         1.6
P(X<=1.9)               0.0082
P(X<=2.1)               0.9452
P(1.9<=X<=2.1)          0.9370

P(1.90 < X < 2.10) = P(–2.40 < Z < 1.60) = 0.9452 – 0.0082 = 0.9370

(c) Partial PHStat output:

Probability for a Range
From X Value            1.9
To X Value              2.1
Z Value for 1.9         –2.4
Z Value for 2.1         1.6
P(X<=1.9)               0.0082
P(X<=2.1)               0.9452
P(1.9<=X<=2.1)          0.9370

P(X < 1.90) + P(X > 2.10) = 1 – P(1.90 < X < 2.10) = 0.0630

(d) Partial PHStat output:

Find X and Z Given Cum. Pctage.
Cumulative Percentage   1.00%
Z Value                 –2.326348
X Value                 1.903683

P(X > A) = P(Z > –2.33) = 0.99
A = 2.02 – 2.33(0.05) = 1.9035

(e) Partial PHStat output:

Find X and Z Given Cum. Pctage.
Cumulative Percentage   99.50%
Z Value                 2.575829
X Value                 2.148791

P(A < X < B) = P(–2.58 < Z < 2.58) = 0.99
A = 2.02 – 2.58(0.05) = 1.8910
B = 2.02 + 2.58(0.05) = 2.1490

6.36

(a)

Partial PHStat output:

Probability for X <=
X Value                 210
Z Value                 -2
P(X<=210)               0.0228

P(X < 210) = P(Z < –2) = 0.0228

(b)

Probability for a Range
From X Value            270
To X Value              300
Z Value for 270         1
Z Value for 300         2.5
P(X<=270)               0.8413
P(X<=300)               0.9938
P(270<=X<=300)          0.1524

P(270 < X < 300) = P(1.0 < Z < 2.5) = 0.1524

(c)

Find X and Z Given Cum. Pctage.
Cumulative Percentage   90.00%
Z Value                 1.2816
X Value                 275.6310

P(X < A) = P(Z < 1.2816) = 0.90
A = 250 + 20(1.2816) = $275.63

(d)

Find X Values Given a Percentage
Percentage              80.00%
Z Value                 -1.28
Lower X Value           224.37
Upper X Value           275.63

P(A < X < B) = P(–1.2816 < Z < 1.2816) = 0.80
A = 250 – 1.2816(20) = $224.37
B = 250 + 1.2816(20) = $275.63
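The symmetric bounds in 6.36 (d) can be checked with a short Python sketch. It assumes μ = 250 and σ = 20, the values used in part (c); the function name `central_interval` is hypothetical.

```python
from statistics import NormalDist


def central_interval(percentage, mean, std_dev):
    """Return (A, B) such that P(A < X < B) = percentage, symmetric about the mean."""
    tail = (1 - percentage) / 2              # area left in each tail
    z = NormalDist().inv_cdf(1 - tail)       # positive Z cutting off the upper tail
    return mean - z * std_dev, mean + z * std_dev


lower, upper = central_interval(0.80, mean=250, std_dev=20)
print(f"A = ${lower:.2f}, B = ${upper:.2f}")
```

The half-width is Z times σ, so the standard deviation that multiplies 1.2816 must be the same one used to standardize in parts (a) through (c).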

6.37

Excel Output:

Descriptive Summary
                        Alcohol         Calories        Carbohydrates
Mean                    5.269490446     155.656051      12.05171975
Median                  4.92            151             12
Mode                    4.2             110             12
Minimum                 2.4             55              1.9
Maximum                 11.5            330             32.1
Range                   9.1             275             30.2
Variance                1.8344          1917.4707       24.8094
Standard Deviation      1.3544          43.7889         4.9809
Coeff. of Variation     25.70%          28.13%          41.33%
Skewness                1.8405          1.2061          0.4912
Kurtosis                4.5814          2.9620          1.0811
Count                   157             157             157
Standard Error          0.1081          3.4947          0.3975

6*Std deviation         8.126298607     262.7336        29.88544064
1.33*S                  1.801329525     58.2392814      6.624606008

Five-Number Summary
                        Alcohol         Calories        Carbohydrates
Minimum                 2.4             55              1.9
First Quartile          4.4             130.5           8.65
Median                  4.92            151             12
Third Quartile          5.65            170.5           14.7
Maximum                 11.5            330             32.1
IQR                     1.25            40              6.05

Alcohol %: The mean is greater than the median; the range is larger than 6 times the standard deviation and the interquartile range is smaller than 1.33 times the standard deviation. The data appear to deviate from the normal distribution.

The normal probability plot suggests that the data are not normally distributed. The kurtosis is 4.5814, indicating a distribution that is more peaked than a normal distribution, with more values in the tails. The skewness of 1.8405 suggests that the distribution is right-skewed.

Calories: The mean is greater than the median; the range is greater than 6 times the standard deviation and the interquartile range is smaller than 1.33 times the standard deviation. The data appear to deviate away from the normal distribution.

The normal probability plot suggests that the data are somewhat right-skewed. The kurtosis is 2.9620, indicating a distribution that is more peaked than a normal distribution, with more values in the tails. The skewness of 1.2061 suggests that the distribution is right-skewed.

Carbohydrates: The mean is approximately equal to the median; the range is approximately equal to 6 times the standard deviation and the interquartile range is slightly smaller than 1.33 times the standard deviation. The data appear to be normally distributed.

The normal probability plot suggests that the data are approximately normally distributed. The kurtosis is 1.0811, indicating a distribution that is slightly more peaked than a normal distribution, with more values in the tails. The skewness of 0.4912 indicates that the distribution deviates slightly from the normal distribution.

6.38

(a) Waiting time will more closely resemble an exponential distribution.

(b) Seating time will more closely resemble a normal distribution.

(c) [Histogram of waiting time: Frequency and Cumulative % by Midpoints (6, 14, 22, 30, 38).]

[Normal probability plot: Waiting versus Z Value.]

Both the histogram and normal probability plot suggest that waiting time more closely resembles an exponential distribution.

(d) [Histogram of seating time: Frequency and Cumulative % by Midpoints (35, 43, 51, 59, 67).]

[Normal probability plot: Seating versus Z Value.]

Both the histogram and normal probability plot suggest that seating time more closely resembles a normal distribution.


6.39 μ = 26.9, σ = 20

(a) P(X > 0) = P(Z > –1.345) = 0.9107

Probability for X >
X Value                 0
Z Value                 -1.345
P(X>0)                  0.9107

(b) P(X > 10) = P(Z > –0.845) = 0.8009

Probability for X >
X Value                 10
Z Value                 -0.845
P(X>10)                 0.8009

(c) P(X < 20) = P(Z < –0.345) = 0.3650

Probability for X <=
X Value                 20
Z Value                 -0.345
P(X<=20)                0.3650

(d) P(X < 30) = P(Z < 0.155) = 0.5616

Probability for X <=
X Value                 30
Z Value                 0.155
P(X<=30)                0.5616

(e) μ = 0.6, σ = 30

(a) P(X > 0) = P(Z > –0.02) = 0.5080

Probability for X >
X Value                 0
Z Value                 -0.02
P(X>0)                  0.5080

(b) P(X > 10) = P(Z > 0.3133) = 0.3770

Probability for X >
X Value                 10
Z Value                 0.3133333
P(X>10)                 0.3770

(c) P(X < 20) = P(Z < 0.6467) = 0.7411

Probability for X <=
X Value                 20
Z Value                 0.6466667
P(X<=20)                0.7411

(d) P(X < 30) = P(Z < 0.98) = 0.8365

Probability for X <=
X Value                 30
Z Value                 0.98
P(X<=30)                0.8365

(f)

The probability that an S&P 500 stock gained value in 2021 is 0.9107. The probability that a NASDAQ stock gained value in 2021 is 0.5080. The probability that an S&P 500 stock gained 10% or more value in 2021 is 0.8009. The probability that a NASDAQ stock gained 10% or more value in 2021 is 0.3770. The probability that an S&P 500 stock returned less than 20% in 2021 is 0.3650. The probability that a NASDAQ stock returned less than 20% in 2021 is 0.7411. The probability that an S&P 500 stock returned less than 30% in 2021 is 0.5616. The probability that a NASDAQ stock returned less than 30% in 2021 is 0.8365. The larger standard deviation of the NASDAQ is associated with higher risk.

6.40 μ = 33,100, σ = 5,000

(a) P(X < 25,000) = P(Z < –1.62) = 0.0526

Probability for X <=
X Value                 25000
Z Value                 -1.62
P(X<=25000)             0.0526

(b) P(25,000 < X < 40,000) = P(–1.62 < Z < 1.38) = 0.9162 – 0.0526 = 0.8636

Probability for a Range
From X Value            25000
To X Value              40000
Z Value for 25000       -1.62
Z Value for 40000       1.38
P(X<=25000)             0.0526
P(X<=40000)             0.9162
P(25000<=X<=40000)      0.8636

(c) P(X > 40,000) = P(Z > 1.38) = 0.0838

Probability for X >
X Value                 40000
Z Value                 1.38
P(X>40000)              0.0838

(d) P(X > A) = 0.99
Z = –2.3263 = (A – 33,100)/5,000
A = $21,468.2606

Find X and Z Given Cum. Pctage.
Cumulative Percentage   1.00%
Z Value                 -2.3263
X Value                 21468.2606

(e) P(X_lower < X < X_upper) = 0.95
P(Z < –1.96) = 0.025 and P(Z < 1.96) = 0.975
Z = –1.96 = (X_lower – 33,100)/5,000 and Z = 1.96 = (X_upper – 33,100)/5,000
X_lower = 33,100 – 1.96(5,000) = $23,300.18 and X_upper = 33,100 + 1.96(5,000) = $42,899.82

Find X Values Given a Percentage
Percentage              95.00%
Z Value                 -1.96
Lower X Value           23300.18
Upper X Value           42899.82

(f) P(20,000 < X < 25,000) = (25,000 – 20,000)/(50,000 – 20,000) = 0.1667

(g) P(25,000 < X < 40,000) = (40,000 – 25,000)/(50,000 – 20,000) = 0.50

(h) P(40,000 < X < 50,000) = (50,000 – 40,000)/(50,000 – 20,000) = 0.3333

6.41 μ = 47,793, σ = 5,000

(a) P(X < 40,000) = P(Z < –1.5586) = 0.0595

Probability for X <=
X Value                 40000
Z Value                 -1.5586
P(X<=40000)             0.0595

(b) P(40,000 < X < 60,000) = P(–1.5586 < Z < 2.4414) = 0.9927 – 0.0595 = 0.9331

Probability for a Range
From X Value            40000
To X Value              60000
Z Value for 40000       -1.5586
Z Value for 60000       2.4414
P(X<=40000)             0.0595
P(X<=60000)             0.9927
P(40000<=X<=60000)      0.9331

(c) P(X > 60,000) = P(Z > 2.4414) = 0.0073

Probability for X >
X Value                 60000
Z Value                 2.4414
P(X>60000)              0.0073

(d) P(X < A) = 0.01
Z = –2.3263 = (A – 47,793)/5,000
A = $36,161.2606

Find X and Z Given Cum. Pctage.
Cumulative Percentage   1.00%
Z Value                 -2.3263
X Value                 36161.2606

(e) P(X_lower < X < X_upper) = 0.95
P(Z < –1.96) = 0.025 and P(Z < 1.96) = 0.975
Z = –1.96 = (X_lower – 47,793)/5,000 and Z = 1.96 = (X_upper – 47,793)/5,000
X_lower = 47,793 – 1.96(5,000) = $37,993.18 and X_upper = 47,793 + 1.96(5,000) = $57,592.82

Find X Values Given a Percentage
Percentage              95.00%
Z Value                 -1.96
Lower X Value           37993.18
Upper X Value           57592.82

(f) P(40,000 < X < 45,000) = (45,000 – 40,000)/(60,000 – 40,000) = 0.25

(g) P(40,000 < X < 50,000) = (50,000 – 40,000)/(60,000 – 40,000) = 0.50

(h) P(45,000 < X < 60,000) = (60,000 – 45,000)/(60,000 – 40,000) = 0.75

6.42

Class project solutions may vary.




Chapter 7

7.1

σ_X̄ = σ/√n = 10/√25 = 2

PHStat output:

Common Data
Mean                    100
Standard Deviation      2

Probability for X <=
X Value                 97.5
Z Value                 –1.25
P(X<=97.5)              0.1056

Probability for a Range
From X Value            95
To X Value              97.5
Z Value for 95          –2.5
Z Value for 97.5        –1.25
P(X<=95)                0.0062
P(X<=97.5)              0.1056
P(95<=X<=97.5)          0.0994

Probability for X >
X Value                 101.7
Z Value                 0.85
P(X>101.7)              0.1977

Probability for X<97.5 or X >101.7
P(X<97.5 or X>101.7)    0.3033

Find X and Z Given Cum. Pctage.
Cumulative Percentage   25.00%
Z Value                 –0.6745
X Value                 98.6510

(a) P(X̄ < 97.5) = P(Z < –1.25) = 0.1056
(b) P(95 < X̄ < 97.5) = P(–2.5 < Z < –1.25) = 0.1056 – 0.0062 = 0.0994
(c) P(X̄ > 101.7) = P(Z > 0.85) = 1.0 – 0.8023 = 0.1977
(d) P(X̄ > A) = P(Z > –0.675) = 0.75
    X̄ = 100 – 0.675(10/√25) = 98.65

7.2

σ_X̄ = σ/√n = 5/√100 = 0.5

PHStat output:

Common Data
Mean                    50
Standard Deviation      0.5

Probability for X <=
X Value                 47
Z Value                 –6
P(X<=47)                9.866E-10

Probability for a Range
From X Value            47
To X Value              49.5
Z Value for 47          –6
Z Value for 49.5        –1
P(X<=47)                0.0000
P(X<=49.5)              0.1587
P(47<=X<=49.5)          0.1587

Probability for X >
X Value                 51.5
Z Value                 3
P(X>51.5)               0.0013

Probability for X<47 or X >51.5
P(X<47 or X >51.5)      0.0013

Find X and Z Given Cum. Pctage.
Cumulative Percentage   65.00%
Z Value                 0.38532
X Value                 50.19266

Probability for X >
X Value                 51.1
Z Value                 2.2
P(X>51.1)               0.0139

(a) P(X̄ < 47) = P(Z < –6.00) = virtually zero
(b) P(47 < X̄ < 49.5) = P(–6.00 < Z < –1.00) = 0.1587 – 0.00 = 0.1587
(c) P(X̄ > 51.1) = P(Z > 2.20) = 1.0 – 0.9861 = 0.0139
(d) P(X̄ > A) = P(Z > 0.39) = 0.35
    X̄ = 50 + 0.39(0.5) = 50.195

7.3

(a) For samples of 25 customer receipts for a supermarket for a year, the sampling distribution of sample means is the distribution of means from all possible samples of 25 customer receipts for a supermarket for that year.

(b) For samples of 25 insurance payouts in a particular geographical area in a year, the sampling distribution of sample means is the distribution of means from all possible samples of 25 insurance payouts in that particular geographical area in that year.

(c) For samples of 25 Call Center logs of inbound calls tracking handling time for a credit card company during the year, the sampling distribution of sample means is the distribution of means from all possible samples of 25 Call Center logs of inbound calls tracking handling time for a credit card company during that year.

7.4

(a) Sampling Distribution of the Mean for n = 2 (without replacement)

Sample Number   Outcomes    Sample Mean X̄_i
1               1, 3        2
2               1, 6        3.5
3               1, 7        4
4               1, 9        5
5               1, 10       5.5
6               3, 6        4.5
7               3, 7        5
8               3, 9        6
9               3, 10       6.5
10              6, 7        6.5
11              6, 9        7.5
12              6, 10       8
13              7, 9        8
14              7, 10       8.5
15              9, 10       9.5

Mean of All Possible Sample Means: μ_X̄ = 90/15 = 6
Mean of All Population Elements: μ = (1 + 3 + 6 + 7 + 9 + 10)/6 = 6
Both means are equal to 6. This property is called unbiasedness.

(b) Sampling Distribution of the Mean for n = 3 (without replacement)

Sample Number   Outcomes    Sample Mean X̄_i
1               1, 3, 6     3 1/3
2               1, 3, 7     3 2/3
3               1, 3, 9     4 1/3
4               1, 3, 10    4 2/3
5               1, 6, 7     4 2/3
6               1, 6, 9     5 1/3
7               1, 6, 10    5 2/3
8               3, 6, 7     5 1/3
9               3, 6, 9     6
10              3, 6, 10    6 1/3
11              6, 7, 9     7 1/3
12              6, 7, 10    7 2/3
13              6, 9, 10    8 1/3
14              7, 9, 10    8 2/3
15              1, 7, 9     5 2/3
16              1, 7, 10    6
17              1, 9, 10    6 2/3
18              3, 7, 9     6 1/3
19              3, 7, 10    6 2/3
20              3, 9, 10    7 1/3

μ_X̄ = 120/20 = 6. This is equal to μ, the population mean.

(c) The distribution for n = 3 has less variability. The larger sample size has resulted in sample means being closer to μ.

(d) (a) Sampling Distribution of the Mean for n = 2 (with replacement)

Sample Number   Outcomes    Sample Mean X̄_i
1               1, 1        1
2               1, 3        2
3               1, 6        3.5
4               1, 7        4
5               1, 9        5
6               1, 10       5.5
7               3, 1        2
8               3, 3        3
9               3, 6        4.5
10              3, 7        5
11              3, 9        6
12              3, 10       6.5
13              6, 1        3.5
14              6, 3        4.5
15              6, 6        6
16              6, 7        6.5
17              6, 9        7.5
18              6, 10       8
19              7, 1        4
20              7, 3        5
21              7, 6        6.5
22              7, 7        7
23              7, 9        8
24              7, 10       8.5
25              9, 1        5
26              9, 3        6
27              9, 6        7.5
28              9, 7        8
29              9, 9        9
30              9, 10       9.5
31              10, 1       5.5
32              10, 3       6.5
33              10, 6       8
34              10, 7       8.5
35              10, 9       9.5
36              10, 10      10

Mean of All Possible Sample Means: μ_X̄ = 216/36 = 6
Mean of All Population Elements: μ = (1 + 3 + 6 + 7 + 9 + 10)/6 = 6
Both means are equal to 6. This property is called unbiasedness.

(b) Repeat the same process for the sampling distribution of the mean for n = 3 (with replacement). There will be 6³ = 216 different samples. μ_X̄ = 6. This is equal to μ, the population mean.

(c) The distribution for n = 3 has less variability. The larger sample size has resulted in more sample means being close to μ.
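The enumeration in 7.4 can be reproduced with a short Python sketch that lists every possible sample from the population 1, 3, 6, 7, 9, 10 and averages the sample means. The function name `sample_means` is hypothetical; exact fractions are used so that thirds are not rounded.

```python
from fractions import Fraction
from itertools import combinations, product
from statistics import mean, pvariance

population = [1, 3, 6, 7, 9, 10]


def sample_means(values, n, replacement=False):
    """Return the mean of every possible sample of size n.

    Without replacement, samples are unordered combinations;
    with replacement, samples are ordered draws (6**n of them).
    """
    samples = product(values, repeat=n) if replacement else combinations(values, n)
    return [Fraction(sum(sample), n) for sample in samples]


for n in (2, 3):
    for replacement in (False, True):
        means = sample_means(population, n, replacement)
        print(n, replacement, len(means), mean(means), float(pvariance(means)))
```

Each line prints the sample size, whether sampling is with replacement, the number of samples, the mean of the sample means, and their variance. The mean of the sample means equals the population mean in every case, and the variance is smaller for n = 3 than for n = 2.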


7.5

(a) P(X < 2.03) = P(Z < –0.4) = 0.3446

Excel Output:
Mean                    2.04
Standard Deviation      0.025

Probability for X <=
X Value                 2.03
Z Value                 -0.4
P(X<=2.03)              0.3446

(b) Because the amount of water in a two-liter bottle is approximately normally distributed, the sampling distribution of the mean for samples of 4 will also be approximately normal with a mean of μ_X̄ = μ = 2.04 and σ_X̄ = σ/√n = 0.025/√4 = 0.0125.

P(X̄ < 2.03) = P(Z < –0.8) = 0.2119

Excel Output:
Mean                    2.04
Standard Deviation      0.0125

Probability for X <=
X Value                 2.03
Z Value                 -0.8
P(X<=2.03)              0.2119

(c) Because the amount of water in a two-liter bottle is approximately normally distributed, the sampling distribution of the mean for samples of 25 will also be approximately normal with a mean of μ_X̄ = μ = 2.04 and σ_X̄ = σ/√n = 0.025/√25 = 0.005.

P(X̄ < 2.03) = P(Z < –2) = 0.0228

Excel Output:
Mean                    2.04
Standard Deviation      0.005

Probability for X <=
X Value                 2.03
Z Value                 -2
P(X<=2.03)              0.0228

(d) (a) refers to the amount of water in an individual two-liter bottle while (c) refers to the mean amount of water in a sample of 25 two-liter water bottles. There is a 34.46% chance that an individual water bottle will contain less than 2.03 liters but a 2.28% chance that the mean amount of water in 25 water bottles will be less than 2.03 liters.

(e) Increasing the sample size from four to 25 reduced the probability that the mean amount of water will be less than 2.03 liters from 21.19% to 2.28%.
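The effect of sample size in 7.5 can be sketched in Python: the only quantity that changes between parts (a), (b), and (c) is the standard error σ/√n. The function name `prob_sample_mean_below` is hypothetical, and n = 1 is used here to represent an individual bottle.

```python
import math
from statistics import NormalDist


def prob_sample_mean_below(x, mu, sigma, n):
    """P(sample mean < x) when the population is normal with mean mu and
    standard deviation sigma, for samples of size n."""
    standard_error = sigma / math.sqrt(n)
    return NormalDist(mu, standard_error).cdf(x)


# Problem 7.5: mu = 2.04 liters, sigma = 0.025 liters
for n in (1, 4, 25):
    p = prob_sample_mean_below(2.03, mu=2.04, sigma=0.025, n=n)
    print(f"n = {n:2d}: P = {p:.4f}")
```

Because the standard error shrinks as n grows, the same cutoff of 2.03 liters lies more standard errors below the mean, and the probability falls.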

7.6

(a) P(X < 42.035) = P(Z < –0.6) = 0.2743

(b) Because the weight of an energy bar is approximately normally distributed, the sampling distribution of the mean for samples of 4 will also be approximately normal with a mean of μ_X̄ = μ = 42.05 and σ_X̄ = σ/√n = 0.0125.

P(X̄ < 42.035) = P(Z < –1.2) = 0.1151

(c) Because the weight of an energy bar is approximately normally distributed, the sampling distribution of the mean for samples of 25 will also be approximately normal with a mean of μ_X̄ = μ = 42.05 and σ_X̄ = σ/√n = 0.005.

P(X̄ < 42.035) = P(Z < –3) = 0.0013

(d) (a) refers to an individual energy bar while (c) refers to the mean of a sample of 25 energy bars. There is a 27.43% chance that an individual energy bar will have a weight below 42.035 grams but only a 0.135% chance that the mean of 25 energy bars will be below 42.035 grams.

(e) Increasing the sample size from four to 25 reduced the probability that the mean weight will be below 42.035 grams from 11.51% to 0.135%.

7.7

(a) Because the population diameter of tennis balls is approximately normally distributed, the sampling distribution of the mean for samples of 9 will also be approximately normal with a mean of μ_X̄ = μ = 2.63 and σ_X̄ = σ/√n = 0.04/√9 = 0.01333.

(b) P(X̄ < 2.61) = P(Z < –1.50) = 0.0668

Probability for X <=
X Value                 2.61
Z Value                 –1.500375
P(X<=2.61)              0.0668

(c) P(2.62 < X̄ < 2.64) = P(–0.75 < Z < 0.75) = 0.5469

Probability for a Range
From X Value            2.62
To X Value              2.64
Z Value for 2.62        –0.750188
Z Value for 2.64        0.7501875
P(X<=2.62)              0.2266
P(X<=2.64)              0.7734
P(2.62<=X<=2.64)        0.5469

(d) P(A < X̄ < B) = P(–0.8416 < Z < 0.8416) = 0.60

Find X and Z Given Cum. Pctage.
Cumulative Percentage   20.00%
Z Value                 –0.8416
X Value                 2.6188

Find X and Z Given Cum. Pctage.
Cumulative Percentage   80.00%
Z Value                 0.8416
X Value                 2.6412

Lower bound: X̄ = 2.6188
Upper bound: X̄ = 2.6412


7.8

(a) When n = 4, the shape of the sampling distribution of X̄ should closely resemble the shape of the distribution of the population from which the sample is selected. Because the mean is larger than the median, the distribution of the sales price of new houses is skewed to the right, and so is the sampling distribution of X̄, although it will be less skewed than the population.

(b) If you select samples of n = 100, the shape of the sampling distribution of the sample mean will be very close to a normal distribution with a mean of $423,300 and a standard error of the mean of σ_X̄ = σ/√n = $90,000/√100 = $9,000.

(c) P(X < 510,000) = P(Z < 0.9633) = 0.8323

Excel Output:
Mean                    423300
Standard Deviation      90000

Probability for X <=
X Value                 510000
Z Value                 0.9633333
P(X<=510000)            0.8323

(d) P(470,000 < X < 480,000) = P(0.5189 < Z < 0.63) = 0.0376

Excel Output:

Probability for a Range
From X Value            470000
To X Value              480000
Z Value for 470000      0.5188889
Z Value for 480000      0.63
P(X<=470000)            0.6981
P(X<=480000)            0.7357
P(470000<=X<=480000)    0.0376

7.9

(a)

Because the number of apps used per month by smartphone owners is assumed to be normally distributed, the sampling distribution of the mean for samples of 25 will also be approximately normal with a mean of μ_X̄ = μ = 30 and σ_X̄ = σ/√n = 8/√25 = 1.6.

P(29 < X̄ < 31) = P(–0.625 < Z < 0.625) = 0.4680

Excel Output:

Common Data
Mean                    30
Standard Deviation      1.6

Probability for a Range
From X Value            29
To X Value              31
Z Value for 29          -0.625
Z Value for 31          0.625
P(X<=29)                0.2660
P(X<=31)                0.7340
P(29<=X<=31)            0.4680

(b) P(28 < X̄ < 32) = P(–1.25 < Z < 1.25) = 0.7887

Excel Output:
Mean                    30
Standard Deviation      1.6

Probability for a Range
From X Value            28
To X Value              32
Z Value for 28          -1.25
Z Value for 32          1.25
P(X<=28)                0.1056
P(X<=32)                0.8944
P(28<=X<=32)            0.7887

(c) Because the number of apps used per month by smartphone owners is assumed to be normally distributed, the sampling distribution of the mean for samples of 100 will also be approximately normal with a mean of μ_X̄ = μ = 30 and σ_X̄ = σ/√n = 8/√100 = 0.8.

P(29 < X̄ < 31) = P(–1.25 < Z < 1.25) = 0.7887

Excel Output:
Mean                    30
Standard Deviation      0.8

Probability for a Range
From X Value            29
To X Value              31
Z Value for 29          -1.25
Z Value for 31          1.25
P(X<=29)                0.1056
P(X<=31)                0.8944
P(29<=X<=31)            0.7887

(d) With the sample size increasing from n = 25 to n = 100, more sample means will be closer to the distribution mean. The standard error of the sampling distribution of size 100 is much smaller than that of size 25, so the likelihood that the sample mean will fall within 1 app of the mean is much higher for samples of size 100 (probability = 0.7887) than for samples of size 25 (probability = 0.4680).

7.10

(a) μ_X̄ = μ = 149 and σ_X̄ = σ/√n = 30/√16 = 7.5

P(X̄ > 130) = P(Z > –2.53) = 0.9944

Excel Output:
Mean                    149
Standard Deviation      7.5

Probability for X >
X Value                 130
Z Value                 -2.533333
P(X>130)                0.9944

(b) P(X̄ < A) = P(Z < 1.0364) = 0.85
X̄ = 149 + 1.0364(7.5) = 156.773

Excel Output:
Mean                    149
Standard Deviation      7.5

Find X and Z Given Cum. Pctage.
Cumulative Percentage   85.00%
Z Value                 1.0364
X Value                 156.7733

(c) To be able to use the standardized normal distribution as an approximation for the area under the curve, you must assume that the population is approximately symmetrical.

(d) P(X̄ < A) = P(Z < 1.0364) = 0.85
X̄ = 149 + 1.0364(3.75) = 152.8866

Excel Output:
Mean                    149
Standard Deviation      3.75

Find X and Z Given Cum. Pctage.
Cumulative Percentage   85.00%
Z Value                 1.0364
X Value                 152.8866

7.11

(a) p = 55/80 = 0.6875
(b) σ_p = √(0.70(0.30)/80) = 0.0512

7.12

(a) p = 20/50 = 0.40
(b) σ_p = √(0.45(0.55)/50) = 0.0704

7.13

(a) p = 16/40 = 0.4
(b) σ_p = √(0.30(0.70)/40) = 0.0725

7.14

(a)

 p    0.501 ,  p 

 1    n

0.5011  0.501  0.05 100

Partial PHstat output: Probability for X > X Value

0.55

Z Value

0.98

P(X>0.55)

0.1635

P(p > 0.55) = P (Z > 0.98) = 1 – 0.8365 = 0.1635 (b)

 p    0.60 ,  p 

 1    n

0.6 1  0.6  100

 0.04899

Partial PHstat output: Probability for X > X Value

0.55

Z Value

–1.020621

P(X>0.55)

0.8463

P(p > 0.55) = P (Z > – 1.021) = 1 – 0.1539 = 0.8461 Copyright ©2024 Pearson Education, Inc.


Solutions to End-of-Section and Chapter Review Problems xxi (c)

 p    0.49 ,  p 

 1    n

0.49 1  0.49  100

 0.05

Partial PHstat output: Probability for X > X Value

0.55

Z Value

1.2002401

P(X>0.55)

(d)

0.1150

P(p > 0.55) = P (Z > 1.20) = 1 – 0.8849 = 0.1151 Increasing the sample size by a factor of 4 decreases the standard error by a factor of 2. (a) Partial PHstat output: Probability for X > X Value

0.55

Z Value

1.9600039

P(X>0.55)

0.0250

P(p > 0.55) = P (Z > 1.96) = 1 – 0.9750 = 0.0250 (b) Partial PHstat output: Probability for X > X Value

0.55

Z Value

–2.041241

P(X>0.55)

0.9794

P(p > 0.55) = P(Z > –2.04) = 1 – 0.0207 = 0.9793

(c) Partial PHstat output: Probability for X > X Value

0.55

Z Value

2.4004801

P(X>0.55)

0.0082

P(p > 0.55) = P(Z > 2.40) = 1 – 0.9918 = 0.0082

If the sample size is increased to 400, the probabilities in (a), (b), and (c) are smaller, larger, and smaller, respectively, because the standard error of the sampling distribution of the sample proportion becomes smaller and, hence, the sampling distribution is more concentrated around the true population proportion.
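The probabilities in Problem 7.14 can also be checked without PHStat. The following is a minimal Python sketch, not part of the original solution; the function name `prob_sample_proportion_above` is hypothetical, and the code assumes the same normal approximation to the sampling distribution of p that the solution uses.

```python
from math import sqrt
from statistics import NormalDist


def prob_sample_proportion_above(pi, n, threshold):
    """Approximate P(p > threshold) for a sample proportion p.

    Assumes p is approximately normal with mean pi and
    standard error sqrt(pi * (1 - pi) / n).
    """
    std_error = sqrt(pi * (1 - pi) / n)
    z = (threshold - pi) / std_error
    return 1 - NormalDist().cdf(z)


# Population proportions from parts (a)-(c), at n = 100 and n = 400.
for pi in (0.501, 0.60, 0.49):
    for n in (100, 400):
        prob = prob_sample_proportion_above(pi, n, 0.55)
        print(f"pi={pi}, n={n}: P(p > 0.55) = {prob:.4f}")
```

Quadrupling n halves the standard error, which is why each probability moves further from 0.5 when n rises from 100 to 400.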

7.15

(a) $\mu_p = \pi = 0.50$, $\sigma_p = \sqrt{\frac{\pi(1-\pi)}{n}} = \sqrt{\frac{0.50(1-0.50)}{200}} = 0.035355339$

Partial PHstat output: Probability for a Range

(b)

From X Value

0.5

To X Value

0.55

Z Value for 0.5

0

Z Value for 0.55

1.4142136

P(X<=0.5)

0.5000

P(X<=0.55)

0.9214

P(0.5<=X<=0.55)

0.4214

P(0.50 < p < 0.55) = P(0 < Z < 1.41) = 0.4214 Partial PHstat output: Find X and Z Given Cum. Pctage. Cumulative Percentage

90.00%

Z Value

1.2816

X Value

0.5453

P(Z < 1.2816) = 0.90, so P(–1.2816 < Z < 1.2816) = 0.80

p = 0.50 – 1.2816(0.03536) = 0.4547

p = 0.50 + 1.2816(0.03536) = 0.5453

(c)

Partial PHstat output: Probability for X > X Value

0.70

Z Value

5.6568543

P(X>0.70)

0.0000

P(p > 0.70) = P(Z > 5.66) = virtually zero

(d) Partial PHstat output: Probability for X > X Value

0.6

Z Value

2.8284271

P(X>0.6)

0.0023

If n = 200, P(p > 0.60) = P(Z > 2.83) = 1.0 – 0.9977 = 0.0023

Probability for X > X Value

0.55

Z Value

3.1622777

P(X>0.55)

0.00078

If n = 1000, P(p > 0.55) = P(Z > 3.16) = 1.0 – 0.99921 = 0.00079

More than 60% correct in a sample of 200 is more likely than more than 55% correct in a sample of 1000.

7.16

(a) $\mu_p = \pi = 0.33$, $\sigma_p = \sqrt{\frac{\pi(1-\pi)}{n}} = \sqrt{\frac{0.33(1-0.33)}{100}} = 0.047$

P(p < 0.3) = P(Z < –0.638) = 0.2616

Excel Output:

Mean

0.33

Standard Deviation

0.047

Probability for X <=

(b)

X Value

0.3

Z Value

-0.638298

P(X<=0.3)

0.2616

P(0.3 < p < 0.4) = P(–0.638 < Z < 1.489) = 0.6702

Excel Output: Probability for a Range From X Value

0.3

To X Value

0.4

Z Value for 0.3

-0.638298

Z Value for 0.4

1.4893617

P(X<=0.3)

0.2616

P(X<=0.4)

0.9318

P(0.3<=X<=0.4)

0.6702

(c) P(p > 0.4) = P(Z > 1.489) = 0.0682

Excel Output: Probability for X >

X Value

0.4

Z Value

1.4893617

P(X>0.4)

0.0682

(d) $\mu_p = \pi = 0.33$, $\sigma_p = \sqrt{\frac{\pi(1-\pi)}{n}} = \sqrt{\frac{0.33(1-0.33)}{400}} = 0.0235$

(a) P(p < 0.3) = P(Z < –1.2766) = 0.1009
(b) P(0.3 < p < 0.4) = P(–1.2766 < Z < 2.9787) = 0.8977
(c) P(p > 0.4) = P(Z > 2.9787) = 0.0014

Excel Output: Mean

0.33

Standard Deviation

0.0235

Probability for X <= X Value

0.3

Z Value

-1.276596

P(X<=0.3)

0.1009

Probability for X > X Value

0.4

Z Value

2.9787234

P(X>0.4)

0.0014

Probability for X<0.3 or X >0.4 P(X<0.3 or X >0.4)

0.1023

Probability for a Range From X Value

0.3

To X Value

0.4

Z Value for 0.3

-1.276596

Z Value for 0.4

2.9787234

P(X<=0.3)

0.1009

P(X<=0.4)

0.9986

P(0.3<=X<=0.4)

0.8977

7.17

$\mu_p = \pi = 0.66$, $\sigma_p = \sqrt{\frac{\pi(1-\pi)}{n}} = \sqrt{\frac{0.66(1-0.66)}{100}} = 0.0474$

Excel Output: Mean

0.66

Standard Deviation

0.0474

Probability for X <= X Value

0.6

Z Value

-1.265823

P(X<=0.6)

0.1028

Probability for X > X Value

0.7

Z Value

0.8438819

P(X>0.7)

0.1994

Excel Output: Probability for a Range From X Value

0.6

To X Value

0.7

Z Value for 0.6

-1.265823

Z Value for 0.7

0.8438819

P(X<=0.6)

0.1028

P(X<=0.7)

0.8006

P(0.6<=X<=0.7)

0.6978

(a) P(p < 0.60) = P(Z < –1.266) = 0.1028
(b) P(0.60 < p < 0.70) = P(–1.266 < Z < 0.844) = 0.6978
(c) P(p > 0.70) = P(Z > 0.844) = 0.1994
(d) $\mu_p = \pi = 0.66$, $\sigma_p = \sqrt{\frac{\pi(1-\pi)}{n}} = \sqrt{\frac{0.66(1-0.66)}{400}} = 0.0237$

(a) P(p < 0.60) = P(Z < –2.532) = 0.0057
(b) P(0.60 < p < 0.70) = P(–2.532 < Z < 1.688) = 0.9486
(c) P(p > 0.70) = P(Z > 1.688) = 0.0457

Excel Output: Mean

0.66

Standard Deviation

0.0237

Probability for X <= X Value

0.6

Z Value

-2.531646

P(X<=0.6)

0.0057

Probability for X > X Value

0.7

Z Value

1.6877637

P(X>0.7)

0.0457

Probability for a Range From X Value

0.6

To X Value

0.7

Z Value for 0.6

-2.531646

Z Value for 0.7

1.6877637

P(X<=0.6)

0.0057

P(X<=0.7)

0.9543

P(0.6<=X<=0.7)

0.9486

7.18

(a) $\mu_p = \pi = 0.75$, $\sigma_p = \sqrt{\frac{\pi(1-\pi)}{n}} = \sqrt{\frac{0.75(1-0.75)}{200}} = 0.0306$

P(0.70 < p < 0.8) = P(–1.634 < Z < 1.634) = 0.8977

Excel Output:

Mean

0.75

Standard Deviation

0.0306

Probability for a Range

(b)

From X Value

0.7

To X Value

0.8

Z Value for 0.7

-1.633987

Z Value for 0.8

1.6339869

P(X<=0.7)

0.0511

P(X<=0.8)

0.9489

P(0.7<=X<=0.8)

0.8977

The probability is 90% that the sample percentage will be contained between 0.70 and 0.80. Excel Output: Find X Values Given a Percentage Percentage

(c)

90.00%

Z Value

-1.64

Lower X Value

0.70

Upper X Value

0.80

The probability is 95% that the sample percentage will be contained between 0.69 and 0.81. Excel Output: Mean

0.75

Standard Deviation

0.0306

Find X Values Given a Percentage Percentage

95.00%

Z Value

-1.96

Lower X Value

0.69

Upper X Value

0.81

7.19

(a) $\mu_p = \pi = 0.45$, $\sigma_p = \sqrt{\frac{\pi(1-\pi)}{n}} = \sqrt{\frac{0.45(1-0.45)}{100}} = 0.0497$

P(0.40 < p < 0.50) = P(–1.006036 < Z < 1.0060362) = 0.6856

Excel Output:

Mean

0.45

Standard Deviation

0.0497

Probability for a Range

(b)

From X Value

0.4

To X Value

0.5

Z Value for 0.4

-1.006036

Z Value for 0.5

1.0060362

P(X<=0.4)

0.1572

P(X<=0.5)

0.8428

P(0.4<=X<=0.5)

0.6856

The probability is 90% that the sample percentage will be contained between 0.37 and 0.53. Excel Output: Find X Values Given a Percentage Percentage

(c)

90.00%

Z Value

-1.64

Lower X Value

0.37

Upper X Value

0.53

The probability is 95% that the sample percentage will be contained between 0.35 and 0.55. Excel Output: Find X Values Given a Percentage Percentage

95.00%

Z Value

-1.96

Lower X Value

0.35

Upper X Value

0.55

(d) $\mu_p = \pi = 0.45$, $\sigma_p = \sqrt{\frac{\pi(1-\pi)}{n}} = \sqrt{\frac{0.45(1-0.45)}{400}} = 0.0249$

P(0.40 < p < 0.50) = P(–2.008032 < Z < 2.0080321) = 0.9554

Excel Output:

Mean

0.45

Standard Deviation

0.0249

Probability for a Range From X Value

0.4

To X Value

0.5

Z Value for 0.4

-2.008032

Z Value for 0.5

2.0080321

P(X<=0.4)

0.0223

P(X<=0.5)

0.9777

P(0.4<=X<=0.5)

0.9554

The probability is 90% that the sample percentage will be contained between 0.41 and 0.49. Excel Output: Find X Values Given a Percentage Percentage

90.00%

Z Value

-1.64

Lower X Value

0.41

Upper X Value

0.49

The probability is 95% that the sample percentage will be contained between 0.40 and 0.50. Find X Values Given a Percentage Percentage

95.00%

Z Value

-1.96

Lower X Value

0.40

Upper X Value

0.50

7.20

(a) $\mu_p = \pi = 0.8(0.776) = 0.6208$, $\sigma_p = \sqrt{\frac{\pi(1-\pi)}{n}} = \sqrt{\frac{0.6208(1-0.6208)}{200}} = 0.0343$

P(0.70 < p < 0.78) = P(2.3090 < Z < 4.6414) = 0.0105

Mean

0.6208

Standard Deviation

0.0343

Probability for a Range

(b)

From X Value

0.7

To X Value

0.78

Z Value for 0.7

2.3090379

Z Value for 0.78

4.6413994

P(X<=0.7)

0.9895

P(X<=0.78)

1.0000

P(0.7<=X<=0.78)

0.0105

The probability that more than 90% of the sample have three or more women on the board of directors is nearly 0. Probability for X >

(c)

X Value

0.9

Z Value

8.1399417

P(X>0.9)

0.0000

The probability that more than 95% of the sample have three or more women on the board of directors is nearly 0. Probability for X > X Value

0.95

Z Value

9.5976676

P(X>0.95)

0.0000

7.21

Because the average of all the possible sample means of size n is equal to the population mean.

7.22

The standard error of the sample means becomes smaller as larger sample sizes are taken. This is due to the fact that an extreme observation will have a smaller effect on the mean in a larger sample than in a small sample. Thus, the sample means will tend to be closer to the population mean as the sample size increases.
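The shrinking standard error can be seen directly from the formula $\sigma_{\bar{X}} = \sigma/\sqrt{n}$. The short Python sketch below is illustrative only; the function name `standard_error_of_mean` and the value of `sigma` are assumptions chosen for the example, not data from the problem.

```python
from math import sqrt


def standard_error_of_mean(sigma, n):
    """Standard error of the sample mean for population std dev sigma and sample size n."""
    return sigma / sqrt(n)


sigma = 10.0  # hypothetical population standard deviation
for n in (4, 16, 64, 256):
    # Each fourfold increase in n cuts the standard error in half.
    print(n, standard_error_of_mean(sigma, n))
```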

7.23

As larger sample sizes are taken, the effect of extreme values on the sample mean becomes smaller and smaller. With large enough samples, even though the population is not normally distributed, the sampling distribution of the mean will be approximately normally distributed.
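A small simulation can make this concrete. The Python sketch below is an illustration under stated assumptions, not part of the original answer: it uses an exponential population (right-skewed, mean 1, standard deviation 1) as a stand-in for a non-normal population, and the helper names `sample_means` and `skewness` are hypothetical.

```python
import random
from math import sqrt
from statistics import fmean, pstdev


def sample_means(n, reps, rng):
    """Means of `reps` samples of size n drawn from an exponential population (mean 1)."""
    return [fmean([rng.expovariate(1.0) for _ in range(n)]) for _ in range(reps)]


def skewness(values):
    """Average cubed z-score; close to 0 for a symmetric, bell-shaped distribution."""
    center = fmean(values)
    spread = pstdev(values)
    return fmean([((v - center) / spread) ** 3 for v in values])


rng = random.Random(2024)  # fixed seed so the run is repeatable
results = {}
for n in (2, 30):
    means = sample_means(n, 5000, rng)
    # Store the spread and skewness of the simulated sampling distribution.
    results[n] = (pstdev(means), skewness(means))
    print(f"n={n}: std dev of means={results[n][0]:.3f}, "
          f"sigma/sqrt(n)={1 / sqrt(n):.3f}, skewness={results[n][1]:.3f}")
```

With the larger sample size, the spread of the sample means should sit close to $\sigma/\sqrt{n}$ and the skewness should move toward zero, which is the behavior the answer describes.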

7.24

The population distribution is the distribution of a particular variable of interest, while the sampling distribution represents the distribution of a statistic.

7.25

When the number of items of interest and the number of items not of interest are each at least 5, the normal distribution can be used to approximate the binomial distribution.
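The rule can be written as a simple check on the expected counts $n\pi$ and $n(1-\pi)$. The Python sketch below is a hedged illustration; the function name `normal_approximation_ok` and the example inputs are assumptions, not values from the problem.

```python
def normal_approximation_ok(n, pi):
    """True when both expected counts, n*pi and n*(1 - pi), are at least 5."""
    return n * pi >= 5 and n * (1 - pi) >= 5


# Hypothetical examples: a large sample passes, a small one with rare successes does not.
print(normal_approximation_ok(100, 0.33))
print(normal_approximation_ok(10, 0.1))
```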

7.26

$\mu_{\bar{X}} = 0.753$, $\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}} = \frac{0.004}{5} = 0.0008$

PHStat output:

Common Data
Mean                              0.753
Standard Deviation                0.0008

Probability for X <=
X Value                           0.74
Z Value                           –16.25
P(X<=0.74)                        1.117E-59

Probability for X >
X Value                           0.76
Z Value                           8.75
P(X>0.76)                         0.0000

Probability for X<0.74 or X >0.76
P(X<0.74 or X >0.76)              0.0000

Probability for a Range
From X Value                      0.75
To X Value                        0.753
Z Value for 0.75                  –3.75
Z Value for 0.753                 0
P(X<=0.75)                        0.0001
P(X<=0.753)                       0.5000
P(0.75<=X<=0.753)                 0.4999

Find X and Z Given Cum. Pctage.
Cumulative Percentage             7.00%
Z Value                           –1.475791
X Value                           0.751819

Probability for a Range
From X Value                      0.74
To X Value                        0.75
Z Value for 0.74                  –16.25
Z Value for 0.75                  –3.75
P(X<=0.74)                        0.0000
P(X<=0.75)                        0.0001
P(0.74<=X<=0.75)                  0.00009

(a) $P(0.75 < \bar{X} < 0.753) = P(-3.75 < Z < 0) = 0.5 - 0.00009 = 0.4999$
(b) $P(0.74 < \bar{X} < 0.75) = P(-16.25 < Z < -3.75) = 0.00009$
(c) $P(\bar{X} > 0.76) = P(Z > 8.75)$ = virtually zero
(d) $P(\bar{X} < 0.74) = P(Z < -16.25)$ = virtually zero
(e) $P(\bar{X} < A) = P(Z < -1.48) = 0.07$, $A = 0.753 - 1.48(0.0008) = 0.7518$
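Parts (a) and (e) can be reproduced with Python's standard library. This is a minimal sketch rather than the manual's method: the helper name `sampling_distribution_of_mean` is hypothetical, and it reads the solution's $0.004/5$ as $\sigma = 0.004$ with $n = 25$, which is an assumption about the problem data.

```python
from math import sqrt
from statistics import NormalDist


def sampling_distribution_of_mean(mu, sigma, n):
    """Normal sampling distribution of the sample mean: mean mu, std error sigma/sqrt(n)."""
    return NormalDist(mu, sigma / sqrt(n))


# Assumed inputs: mu = 0.753, sigma = 0.004, n = 25 (so the standard error is 0.0008).
dist = sampling_distribution_of_mean(0.753, 0.004, 25)

# Part (a): probability that the sample mean falls between 0.75 and 0.753.
p_range = dist.cdf(0.753) - dist.cdf(0.75)

# Part (e): value below which 7% of sample means fall.
x_7th = dist.inv_cdf(0.07)

print(round(p_range, 4), round(x_7th, 6))
```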

7.27

$\mu_{\bar{X}} = 2.0$, $\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}} = \frac{0.045}{\sqrt{25}} = 0.009$

PHStat output:

Common Data
Mean                              2
Standard Deviation                0.009

Probability for X <=
X Value                           1.98
Z Value                           –2.22
P(X<=1.98)                        0.0131

Probability for X >
X Value                           2.01
Z Value                           1.11
P(X>2.01)                         0.1333

Probability for X<1.98 or X >2.01
P(X<1.98 or X >2.01)              0.1464

Probability for a Range
From X Value                      1.99
To X Value                        2.0
Z Value for 1.99                  –1.11
Z Value for 2                     0
P(X<=1.99)                        0.1333
P(X<=2)                           0.5000
P(1.99<=X<=2)                     0.3667

Find X and Z Given Cum. Pctage.
Cumulative Percentage             1.00%
Z Value                           –2.3263
X Value                           1.9791

Find X and Z Given Cum. Pctage.
Cumulative Percentage             99.50%
Z Value                           2.5758
X Value                           2.0232

(a) $P(1.99 < \bar{X} < 2.00) = P(-1.11 < Z < 0) = 0.5 - 0.1333 = 0.3667$
(b) $P(\bar{X} < 1.98) = P(Z < -2.22) = 0.0131$
(c) $P(\bar{X} > 2.01) = P(Z > 1.11) = 1.0 - 0.8667 = 0.1333$
(d) $P(\bar{X} > A) = P(Z > -2.33) = 0.99$, $A = 2.00 - 2.33(0.009) = 1.9791$
(e) $P(A < \bar{X} < B) = P(-2.58 < Z < 2.58) = 0.99$, $A = 2.00 - 2.58(0.009) = 1.9768$, $B = 2.00 + 2.58(0.009) = 2.0232$

7.28

$\mu_{\bar{X}} = 4.7$, $\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}} = \frac{0.40}{5} = 0.08$

PHStat output:

Common Data
Mean                              4.7
Standard Deviation                0.08

Probability for X >
X Value                           4.6
Z Value                           –1.25
P(X>4.6)                          0.8944

Find X and Z Given Cum. Pctage.
Cumulative Percentage             15.00%
Z Value                           –1.036433
X Value                           4.6170853

Find X and Z Given Cum. Pctage.
Cumulative Percentage             85.00%
Z Value                           1.036433
X Value                           4.782915

Find X and Z Given Cum. Pctage.
Cumulative Percentage             23.00%
Z Value                           –0.738847
X Value                           4.640892

(a) $P(4.60 < \bar{X}) = P(-1.25 < Z) = 1 - 0.1056 = 0.8944$
(b) $P(A < \bar{X} < B) = P(-1.04 < Z < 1.04) = 0.70$, $A = 4.70 - 1.04(0.08) = 4.6168$ ounces, $B = 4.70 + 1.04(0.08) = 4.7832$ ounces
(c) $P(\bar{X} > A) = P(Z > -0.74) = 0.77$, $A = 4.70 - 0.74(0.08) = 4.6408$

7.29

(a) $\mu_{\bar{X}} = 4.9$, $\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}} = \frac{0.40}{\sqrt{25}} = 0.08$

Partial PHStat output:

Probability for X >
X Value                           4.6
Z Value                           –3.75
P(X>4.60)                         0.9999

$P(4.60 < \bar{X}) = P(-3.75 < Z) = 0.9999$

(b) Partial PHStat output:

Find X and Z Given Cum. Pctage.
Cumulative Percentage             15.00%
Z Value                           –1.0364
X Value                           4.8171

Find X and Z Given Cum. Pctage.
Cumulative Percentage             85.00%
Z Value                           1.0364
X Value                           4.9829

$P(A < \bar{X} < B) = P(-1.0364 < Z < 1.0364) = 0.70$, $A = 4.9 - 1.0364(0.08) = 4.8171$ ounces, $B = 4.9 + 1.0364(0.08) = 4.9829$ ounces

(c) Partial PHStat output:

Find X and Z Given Cum. Pctage.
Cumulative Percentage             23.00%
Z Value                           –0.7388
X Value                           4.8409

$P(\bar{X} > A) = P(Z > -0.7388) = 0.77$, $A = 4.9 - 0.7388(0.08) = 4.8409$

7.30

$\mu_{\bar{X}} = 21.08$, $\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}} = \frac{20}{4} = 5$

Excel Output:

Mean                              21.08
Standard Deviation                5

Probability for X <=
X Value                           0
Z Value                           -4.216
P(X<=0)                           0.0000

Probability for a Range
From X Value                      0
To X Value                        10
Z Value for 0                     -4.216
Z Value for 10                    -2.216
P(X<=0)                           0.0000
P(X<=10)                          0.0133
P(0<=X<=10)                       0.0133

Probability for X >
X Value                           10
Z Value                           -2.216
P(X>10)                           0.9867

(a) $P(\bar{X} < 0) = P(Z < -4.216) = 0.0000$
(b) $P(0 < \bar{X} < 10) = P(-4.216 < Z < -2.216) = 0.0133$
(c) $P(\bar{X} > 10) = P(Z > -2.216) = 0.9867$

7.31

Excel Output:

Mean                              -10.87
Standard Deviation                10

Probability for X <=
X Value                           0
Z Value                           1.087
P(X<=0)                           0.8615

Probability for a Range
From X Value                      -10
To X Value                        -20
Z Value for -10                   0.087
Z Value for -20                   -0.913
P(X<=-10)                         0.5347
P(X<=-20)                         0.1806
P(-10<=X<=-20)                    0.3540

Probability for X >
X Value                           -5
Z Value                           0.587
P(X>-5)                           0.2786

(a) $P(X < 0) = P(Z < 1.087) = 0.8615$
(b) $P(-20 < X < -10) = P(-0.913 < Z < 0.087) = 0.3540$
(c) $P(X > -5) = P(Z > 0.587) = 0.2786$

$\mu_{\bar{X}} = -10.87$, $\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}} = \frac{10}{2} = 5$

Excel Output:

Mean                              -10.87
Standard Deviation                5

Probability for X <=
X Value                           0
Z Value                           2.174
P(X<=0)                           0.9851

Probability for a Range
From X Value                      -10
To X Value                        -20
Z Value for -10                   0.174
Z Value for -20                   -1.826
P(X<=-10)                         0.5691
P(X<=-20)                         0.0339
P(-10<=X<=-20)                    0.5351

Probability for X >
X Value                           -5
Z Value                           1.174
P(X>-5)                           0.1202

(d) $P(\bar{X} < 0) = P(Z < 2.174) = 0.9851$
(e) $P(-20 < \bar{X} < -10) = P(-1.826 < Z < 0.174) = 0.5351$
(f) $P(\bar{X} > -5) = P(Z > 1.174) = 0.1202$
(g) Since the sample mean of returns of a sample of stocks is distributed closer to the population mean than the return of a single stock, the probabilities in (a) and (b) are lower than those in (d) and (e) while the probability in (c) is higher than that in (f).

7.32

Class Project answers will vary. The mean of the uniform distribution is $\frac{a+b}{2}$; since the random numbers in the table range from 0 to 9, the mean is 4.5. When n = 2, the frequency distribution of the sample means for the class should be centered around 4.5 and have a shape similar to column B with n = 2 in Figure 7.4 page 232 of text. As the sample size increases, the frequency distribution of the sample means should have a shape similar to a normal distribution centered around 4.5.

7.33

Class Project answers will vary. This scenario simulates a binomial distribution with $\pi = 0.5$; the mean is $n\pi = 10(0.5) = 5$. The frequency distribution of the entire class should have a shape similar to a normal distribution centered around 5.

7.34

Class Project answers will vary. Population mean is 1.310 and population standard deviation is 1.13.

Depending on class results, one should expect similar results to example 7.5 on page 279 of text. The frequency distributions of the sample means for each sample size should progress from a skewed population toward a bell-shaped distribution as the sample size increases.

7.35

(a)–(b) Class Project answers will vary. (c) Since the population histogram of average credit score is fairly symmetrical, one can expect the class created frequency distributions of the sample means (for each sample size) to be approximately normal for n = 5, n = 15, and n = 30.




Chapter 8

8.1

$\bar{X} \pm Z\frac{\sigma}{\sqrt{n}} = 85 \pm 1.96 \cdot \frac{6}{\sqrt{64}}$, so $83.53 \le \mu \le 86.47$

8.2

$\bar{X} \pm Z\frac{\sigma}{\sqrt{n}} = 125 \pm 2.58 \cdot \frac{24}{\sqrt{36}}$, so $114.68 \le \mu \le 135.32$

8.3

Since the results of only one sample are used to indicate whether something has gone wrong in the production process, the manufacturer can never know with 100% certainty that the specific interval obtained from the sample includes the true population mean. In order to have 100% confidence, the entire population (sample size N) would have to be selected.

8.4

Yes, it is true since 5% of intervals will not include the population mean.

8.5

If all possible samples of the same size n = 100 are taken, 95% of them will include the true population mean time spent on Twitter per day. Thus, you are 95 percent confident that this sample is one that does correctly estimate the true mean time spent on Twitter per day.

8.6

(a) You would compute the mean first because you need the mean to compute the standard deviation. If you had a sample, you would compute the sample mean. If you had the population mean, you would compute the population standard deviation.

(b) If you have a sample, you are computing the sample standard deviation, not the population standard deviation needed in Equation 8.1. If you have a population, and have computed the population mean and population standard deviation, you don't need a confidence interval estimate of the population mean since you already have computed it.

8.7

If the population mean time spent on Twitter is 56 minutes a day, the confidence interval estimate stated in Problem 8.5 is correct because it contains the value 56 minutes.

8.8

Equation (8.1) assumes that you know the population standard deviation. Because you are selecting a sample of 100 from the population, you are computing a sample standard deviation, not the population standard deviation.

8.9

(a) $\bar{X} \pm Z\frac{\sigma}{\sqrt{n}} = 0.985 \pm 2.58 \cdot \frac{0.02}{\sqrt{50}}$, so $0.9777 \le \mu \le 0.9923$

(b) Since the value of 1.0 is not included in the interval, there is reason to believe that the mean is different from 1.0 gallon and the distributor has a right to complain.

(c) No. Since $\sigma$ is known and $n = 50$, from the Central Limit Theorem, we may assume that the sampling distribution of $\bar{X}$ is approximately normal.

(d) The reduced confidence level narrows the width of the confidence interval.

$\bar{X} \pm Z\frac{\sigma}{\sqrt{n}} = 0.985 \pm 1.96 \cdot \frac{0.02}{\sqrt{50}}$, so $0.9795 \le \mu \le 0.9905$

Since the value of 1.0 is still not included in the interval, the distributor does have a right to complain.

8.10

(a) $\bar{X} \pm Z\frac{\sigma}{\sqrt{n}} = 49{,}875 \pm 1.96 \cdot \frac{1500}{\sqrt{64}}$, so $49{,}507.51 \le \mu \le 50{,}242.49$

(b) Yes, because the confidence interval includes 50,000 hours, the manufacturer can support a claim that the bulbs have a mean of 50,000 hours.

(c) No. Because $\sigma$ is known and n = 64, from the Central Limit Theorem, you know that the sampling distribution of $\bar{X}$ is approximately normal.

(d) $\bar{X} \pm Z\frac{\sigma}{\sqrt{n}} = 49{,}875 \pm 1.96 \cdot \frac{500}{\sqrt{64}}$, so $49{,}752.50 \le \mu \le 49{,}997.50$

The confidence interval is narrower, based on the population standard deviation of 500 hours, and it no longer includes 50,000, so the manufacturer could not state that the LED bulbs have a mean life of 50,000 hours.
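The two intervals in Problem 8.10 follow the same recipe, $\bar{X} \pm Z\,\sigma/\sqrt{n}$, with only $\sigma$ changed. The Python sketch below is illustrative and not the manual's Excel workflow; the function name `z_interval` is hypothetical, and it uses the exact normal critical value (about 1.95996) rather than the rounded 1.96, so the last digit can differ slightly from the printed limits.

```python
from math import sqrt
from statistics import NormalDist


def z_interval(xbar, sigma, n, confidence):
    """Confidence interval for the mean when the population std dev sigma is known."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # two-tailed critical value
    half_width = z * sigma / sqrt(n)
    return xbar - half_width, xbar + half_width


# Part (a): sigma = 1500 hours; part (d): sigma = 500 hours.
for sigma in (1500, 500):
    lower, upper = z_interval(49875, sigma, 64, 0.95)
    print(f"sigma={sigma}: {lower:.2f} <= mu <= {upper:.2f}")
```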

8.11

$\bar{X} \pm t\frac{S}{\sqrt{n}} = 75 \pm 2.0301 \cdot \frac{20}{\sqrt{36}}$, so $68.2330 \le \mu \le 81.7670$

8.12

(a) df = 9, $\alpha$ = 0.05, $t_{\alpha/2}$ = 2.2622
(b) df = 9, $\alpha$ = 0.01, $t_{\alpha/2}$ = 3.2498
(c) df = 31, $\alpha$ = 0.05, $t_{\alpha/2}$ = 2.0395
(d) df = 64, $\alpha$ = 0.05, $t_{\alpha/2}$ = 1.9977
(e) df = 15, $\alpha$ = 0.1, $t_{\alpha/2}$ = 1.7531

8.13

Set 1: $4.5 \pm 2.3646 \cdot \frac{3.7417}{\sqrt{8}}$, so $1.3719 \le \mu \le 7.6281$

Set 2: $4.5 \pm 2.3646 \cdot \frac{2.4495}{\sqrt{8}}$, so $2.4522 \le \mu \le 6.5478$

The data sets have different confidence interval widths because they have different values for the standard deviation.

8.14

Original data: $5.8571 \pm 2.4469 \cdot \frac{6.4660}{\sqrt{7}}$, so $-0.1229 \le \mu \le 11.8371$

Altered data: $4.00 \pm 2.4469 \cdot \frac{2.1602}{\sqrt{7}}$, so $2.0022 \le \mu \le 5.9978$

The presence of an outlier in the original data increases the value of the sample mean and greatly inflates the sample standard deviation.

8.15

(a)

PHStat output: Confidence Interval Estimate for the Mean

Data
Sample Standard Deviation         200
Sample Mean                       1500
Sample Size                       100
Confidence Level                  95%

Intermediate Calculations
Standard Error of the Mean        20
Degrees of Freedom                99
t Value                           1.9842
Interval Half Width               39.6843

Confidence Interval
Interval Lower Limit              1460.32
Interval Upper Limit              1539.68

$\bar{X} \pm t\frac{S}{\sqrt{n}} = 1500 \pm 1.9842 \cdot \frac{200}{\sqrt{100}}$, so $\$1460.32 \le \mu \le \$1539.68$

(b) You can be 95% confident that the population mean spending for all Amazon Prime member shoppers is somewhere between $1460.32 and $1539.68.
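The intermediate calculations in the PHStat output for Problem 8.15 can be traced step by step. The Python sketch below is a hedged illustration with a hypothetical function name, `t_interval`; because Python's standard library has no Student's t quantile function, the critical value 1.9842 is taken from the output above rather than computed.

```python
from math import sqrt


def t_interval(xbar, s, n, t_critical):
    """Confidence interval for the mean using a supplied t critical value (df = n - 1)."""
    std_error = s / sqrt(n)               # standard error of the mean
    half_width = t_critical * std_error   # interval half width
    return xbar - half_width, xbar + half_width


# Values from Problem 8.15; the t value for 99 degrees of freedom comes from the PHStat output.
lower, upper = t_interval(xbar=1500, s=200, n=100, t_critical=1.9842)
print(f"{lower:.2f} <= mu <= {upper:.2f}")
```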

8.16

(a) $\bar{X} \pm t\frac{S}{\sqrt{n}} = 87 \pm 1.9781 \cdot \frac{9}{\sqrt{133}}$, so $85.46 \le \mu \le 88.54$

(b) You can be 95% confident that the population mean one-time gift donations is somewhere between $85.46 and $88.54.

8.17

(a) $\bar{X} \pm t\frac{S}{\sqrt{n}} = 197.8 \pm 2.1098 \cdot \frac{21.4}{\sqrt{18}}$, so $187.1580 \le \mu \le 208.4420$

(b) No, a grade of 200 is in the interval.

(c) It is not unusual to have an observed tread wear index of 210, which is outside the 95% confidence interval for the population mean tread wear index, because the standard error of the sample mean, $\sigma/\sqrt{n}$, is smaller than the standard deviation of the population, $\sigma$, of the tread wear index for a single observed tread wear. Hence, the value of a single observed tread wear index varies around the population mean more than a sample mean does.

8.18

(a) $6.32 \le \mu \le 7.87$

Minitab Output:

(b) You can be 95% confident that the population mean amount spent for lunch at a fast-food restaurant is between $6.31 and $7.87.

(c) That the population distribution is normally distributed.

(d) The assumption of normality is not seriously violated and with a sample of 15, the validity of the confidence interval is not seriously impacted.

8.19

(a) Commuting Time: $24.72 \le \mu \le 26.05$

Data

Sample Standard Deviation

3.3297

Sample Mean

25.385

Sample Size

100

Confidence Level

95%

Intermediate Calculations Standard Error of the Mean

0.33297

Degrees of Freedom

99

t Value

1.9842

Interval Half Width

0.6607

Confidence Interval

Interval Lower Limit

24.72

Interval Upper Limit

26.05

(b) You can be 95% confident that the population mean commuting time is somewhere between 24.72 minutes and 26.05 minutes.

(c) That the population distributions are normally distributed.

(d) The assumption of normality is not seriously violated with sample sizes of 30. The validity of the confidence interval is not seriously impacted.

8.20

(a) For First and Second Quarter ads: $5.52 \le \mu \le 6.06$

For Halftime and Second half ads: $5.26 \le \mu \le 5.80$

Excel Output:

                                  Before Halftime   Halftime and Afterwards
Data
Sample Standard Deviation         0.702028          0.709263
Sample Mean                       5.789286          5.534483
Sample Size                       28                29
Confidence Level                  95%               95%

Intermediate Calculations
Standard Error of the Mean        0.132671          0.131707
Degrees of Freedom                27                28
t Value                           2.0518            2.0484
Interval Half Width               0.2722            0.2698

Confidence Interval
Interval Lower Limit              5.52              5.26
Interval Upper Limit              6.06              5.80

(b) You are 95% confident that the mean rating for first and second quarter ads is between 5.52 and 6.06. You are 95% confident that the mean rating for halftime and second half ads is between 5.26 and 5.80.

(c) The confidence intervals for the two groups of ads are similar.

(d) You need to assume that the distributions of the rating for the two groups of ads are normally distributed.

(e) The distribution of each group of ads appears approximately normally distributed.

8.21

Excel Output:

                                  One-Year CD       Five-Year CD
Data
Sample Standard Deviation         0.157             0.2762
Sample Mean                       0.1761            0.3803
Sample Size                       36                36
Confidence Level                  95%               95%

Intermediate Calculations
Standard Error of the Mean        0.026167          0.046033
Degrees of Freedom                35                35
t Value                           2.0301            2.0301
Interval Half Width               0.0531            0.0935

Confidence Interval
Interval Lower Limit              0.12              0.29
Interval Upper Limit              0.23              0.47

(a) One-Year CD: $0.12 \le \mu \le 0.23$

(b) Five-Year CD: $0.29 \le \mu \le 0.47$

(c) The mean yield for a one-year CD is somewhere between 0.12 and 0.23 with 95% confidence and the mean yield for a five-year CD is somewhere between 0.29 and 0.47 with 95% confidence.

8.22

(a) $\bar{X} \pm t\frac{S}{\sqrt{n}} = 43.04 \pm 2.0096 \cdot \frac{41.9261}{\sqrt{50}}$, so $31.12 \le \mu \le 54.96$

(b) The population distribution needs to be normally distributed.

(c) [Figure: Normal Probability Plot (Days versus Z Value)]

[Figure: Box-and-whisker Plot (Days)]

Both the normal probability plot and the boxplot suggest that the distribution is skewed to the right.

(d) Even though the population distribution is not normally distributed, with a sample of 50, the t distribution can still be used due to the Central Limit Theorem.

8.23

(a) $\bar{X} \pm t\frac{S}{\sqrt{n}} = 1724.4 \pm 2.0452 \cdot \frac{87.3651}{\sqrt{30}}$, so $1691.78 \le \mu \le 1757.02$

Confidence Interval Estimate for the Mean

Data Sample Standard Deviation

87.3651

Sample Mean

1724.4

Sample Size

30

Confidence Level

95%

Intermediate Calculations Standard Error of the Mean Degrees of Freedom

15.950612 29

t Value

2.0452

Interval Half Width

32.6227

Confidence Interval

Interval Lower Limit

1691.78

Interval Upper Limit

1757.02

(b) The population distribution needs to be normally distributed.

(c) The normal probability plot indicates that the population distribution is normally distributed.

[Figure: Normal Probability Plot (Force versus Z Value)]

8.24

Excel Output: Confidence Interval Estimate for the Mean

Data Sample Standard Deviation Sample Mean

14.1397 59.30714286

Sample Size

14

Confidence Level

95%

Intermediate Calculations Standard Error of the Mean

3.778993782

Degrees of Freedom

13

t Value

2.1604

Interval Half Width

8.1640

Confidence Interval

Interval Lower Limit

51.14

Interval Upper Limit

67.47

(a) $51.14 \le \mu \le 67.47$

(b) The population distribution is normally distributed.

(c) The boxplot appears to be left skewed, so the validity of the confidence interval is questionable.

8.25

(a) $\bar{X} \pm t\frac{S}{\sqrt{n}} = -0.00023 \pm 1.9842 \cdot \frac{0.0017}{\sqrt{100}}$, so $-0.000566 \le \mu \le 0.000106$

(b) The population distribution needs to be normally distributed. However, with a sample of 100, the t distribution can still be used as a result of the Central Limit Theorem even if the population distribution is not normal.

(c) [Figure: Normal Probability Plot (Error versus Z Value)]

[Figure: Box-and-whisker Plot (Error)]

Both the normal probability plot and the boxplot suggest that the distribution is skewed to the right.

(d) We are 95% confident that the mean difference between the actual length of the steel part and the specified length of the steel part is between –0.000566 and 0.000106 inch, which is narrower than the plus or minus 0.005 inch requirement. The steel mill is doing a good job at meeting the requirement. This is consistent with the finding in Problem 2.43.

8.26

$p = \frac{X}{n} = \frac{50}{200} = 0.25$

$p \pm Z\sqrt{\frac{p(1-p)}{n}} = 0.25 \pm 1.96\sqrt{\frac{0.25(0.75)}{200}}$, so $0.19 \le \pi \le 0.31$

8.27

$p = \frac{X}{n} = \frac{20}{400} = 0.05$

$p \pm Z\sqrt{\frac{p(1-p)}{n}} = 0.05 \pm 2.58\sqrt{\frac{0.05(0.95)}{400}}$, so $0.0219 \le \pi \le 0.0781$

8.28

(a) $p = \frac{X}{n} = \frac{135}{500} = 0.27$

$p \pm Z\sqrt{\frac{p(1-p)}{n}} = 0.27 \pm 2.5758\sqrt{\frac{0.27(1-0.27)}{500}}$, so $0.22 \le \pi \le 0.32$

(b) The manager in charge of promotional programs concerning residential customers can infer that the proportion of households that would purchase a new cellphone if it were made available at a substantially reduced installation cost is between 0.22 and 0.32 with a 99% level of confidence.

8.29

(a) Excel Output: Number of successes = 400(0.73) = 292

Confidence Interval Estimate for the Proportion

Data Sample Size

400

Number of Successes

292

Confidence Level

95%

Intermediate Calculations Sample Proportion

0.73

Z Value

-1.9600

Standard Error of the Proportion

0.0222

Interval Half Width

0.0435

Confidence Interval

Interval Lower Limit

0.6865

Interval Upper Limit

0.7735

$p = 0.73$, $p \pm Z\sqrt{\frac{p(1-p)}{n}} = 0.73 \pm 1.96\sqrt{\frac{0.73(1-0.73)}{400}}$, so $0.6865 \le \pi \le 0.7735$

The 95% confidence interval for the proportion of adults who responded somewhat agree or strongly agree that flexibility in work scheduling increases productivity is somewhere between 68.65% and 77.35%.

(b) One can infer that a large proportion of U.S. adults want flexibility in work scheduling.
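The interval in Problem 8.29 can be rebuilt from the sample proportion and sample size alone. The Python sketch below is a minimal illustration, not the manual's Excel worksheet; the function name `proportion_interval` is hypothetical, and it uses the exact normal critical value instead of the rounded 1.96.

```python
from math import sqrt
from statistics import NormalDist


def proportion_interval(successes, n, confidence):
    """Normal-approximation confidence interval for a population proportion."""
    p = successes / n                                   # sample proportion
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # two-tailed critical value
    half_width = z * sqrt(p * (1 - p) / n)
    return p - half_width, p + half_width


# Problem 8.29: 292 successes out of 400, 95% confidence.
lower, upper = proportion_interval(292, 400, 0.95)
print(f"{lower:.4f} <= pi <= {upper:.4f}")
```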

8.30

(a) Excel Output: Number of successes = 2000(0.798) = 1596

Confidence Interval Estimate for the Proportion

Data

Sample Size

2000

Number of Successes

1596

Confidence Level

95%

Intermediate Calculations Sample Proportion

0.798

Z Value

-1.9600

Standard Error of the Proportion

0.0090

Interval Half Width

0.0176

Confidence Interval

Interval Lower Limit

0.7804

Interval Upper Limit

0.8156

$p = 0.798$, $p \pm Z\sqrt{\frac{p(1-p)}{n}} = 0.798 \pm 1.96\sqrt{\frac{0.798(1-0.798)}{2000}}$, so $0.7804 \le \pi \le 0.8156$

The 95% confidence interval estimate for the population proportion of U.S. Internet shoppers aged 18 and above who said they shopped at Amazon because of free shipping is between 78.04% and 81.56%.

(b) Excel Output: Number of successes = 2000(0.689) = 1378

Data

Sample Size

2000

Number of Successes

1378

Confidence Level

95%

Intermediate Calculations

Sample Proportion

0.689

Z Value

-1.9600

Standard Error of the Proportion

0.0104

Interval Half Width

0.0203

Confidence Interval Interval Lower Limit

0.6687

Interval Upper Limit

0.7093

p 1  p 

0.689 1  0.689 

0.6687    0.7093 n 2000 The 95% confidence interval estimate for the population proportion of U.S. Internet shoppers aged 18 and above who said they shopped at Amazon because of broad selection is between 66.87% and 70.93%. p  0.689

(c) 8.31

(a)

pZ

 0.689  1.96

A large proportion of Amazon shoppers purchase from Amazon because of free shipping and because of its broad selection. Excel Output: Number of successes  1000(0.85)  850 Confidence Interval Estimate for the Proportion Data Sample Size

1000

Number of Successes

850

Confidence Level

95%

Intermediate Calculations Sample Proportion

0.85

Z Value

-1.9600

Standard Error of the Proportion

0.0113

Interval Half Width

0.0221

Confidence Interval

Interval Lower Limit

0.8279

Interval Upper Limit

0.8721

p 1  p 

p  0.85 p  Z

 0.85  1.96

0.85 1  0.85 

n 1000 0.8279    0.8721 The 95% confidence interval estimate for the population proportion of U.S. adults who now say that they go online on a daily basis is between 82.79% and 87.21%.

(b)

Excel Output: Number of successes  1000(0.07)  70 Data Sample Size

1000

Number of Successes

70

Confidence Level

95%

Intermediate Calculations Sample Proportion

0.07

Z Value

-1.9600

Standard Error of the Proportion

0.0081

Interval Half Width

0.0158

Confidence Interval Interval Lower Limit

0.0542

Interval Upper Limit

0.0858

$p = 0.07$; $p \pm Z\sqrt{\frac{p(1-p)}{n}} = 0.07 \pm 1.96\sqrt{\frac{0.07(1-0.07)}{1000}}$; $0.0542 \le \pi \le 0.0858$

The 95% confidence interval estimate for the population proportion of U.S. adults who say that they do not use the Internet at all is between 5.42% and 8.58%.

(c)

A large proportion of U.S. adults say they use the Internet on a daily basis, while a very small proportion say that they do not use the Internet at all.


8.32

(a)

Excel Output: Number of successes  500(0.40)  200 Confidence Interval Estimate for the Proportion Data Sample Size

500

Number of Successes

200

Confidence Level

95%

Intermediate Calculations Sample Proportion

0.4

Z Value

-1.9600

Standard Error of the Proportion

0.0219

Interval Half Width

0.0429

Confidence Interval Interval Lower Limit

0.3571

Interval Upper Limit

0.4429

$p = 0.40$; $p \pm Z\sqrt{\frac{p(1-p)}{n}} = 0.40 \pm 1.96\sqrt{\frac{0.40(1-0.40)}{500}}$; $0.3571 \le \pi \le 0.4429$

The 95% confidence interval estimate for the population proportion of Americans who drink their coffee at coffee shops and who drink at Starbucks is between 35.71% and 44.29%.

(b)

Excel Output: Number of successes  500(0.26)  130 Data Sample Size

500

Number of Successes

130

Confidence Level

95%

Intermediate Calculations

Sample Proportion

0.26

Z Value

-1.9600

Standard Error of the Proportion

0.0196

Interval Half Width

0.0384

Confidence Interval Interval Lower Limit

0.2216

Interval Upper Limit

0.2984

$p = 0.26$; $p \pm Z\sqrt{\frac{p(1-p)}{n}} = 0.26 \pm 1.96\sqrt{\frac{0.26(1-0.26)}{500}}$; $0.2216 \le \pi \le 0.2984$

The 95% confidence interval estimate for the population proportion of Americans who drink their coffee at coffee shops and who drink at Dunkin is between 22.16% and 29.84%.

(c)

A large proportion of Americans who drink their coffee at coffee shops drink at Starbucks or at Dunkin.

8.33

(a)

Excel Output: Number of successes = 1,358

Confidence Interval Estimate for the Proportion

Data

Sample Size

3725

Number of Successes

1358

Confidence Level

95%

Intermediate Calculations Sample Proportion

0.364563758

Z Value

-1.9600

Standard Error of the Proportion

0.0079

Interval Half Width

0.0155

Confidence Interval

Interval Lower Limit

0.3491

Interval Upper Limit

0.3800

p 1  p  0.3646 1  0.3646  X 1358  0.3646  1.96   0.3646 p  Z n 3,725 n 3725 0.3491    0.3800 The 95% confidence interval estimate for population proportion of customers who had paperless billing and who churned in the last month s is between 34.91% and 38.00%. p

(b)

Excel Output: Number of successes = 398 Data Sample Size

1792

Number of Successes

398

Confidence Level

95%

Intermediate Calculations Sample Proportion

0.222098214

Z Value

-1.9600

Standard Error of the Proportion

0.0098

Interval Half Width

0.0192

Confidence Interval Interval Lower Limit

0.2029

Interval Upper Limit

0.2413

p 1  p  0.22211  0.2221 X 398  0.2221  1.96   0.2221 p  Z n 1,792 n 1792 0.2029    0.2413 The 95% confidence interval estimate for the population proportion of customers who did not have paperless billing and who churned in the last month s is between 20.29% and 24.13%. p

(c)

A small proportion of telecom customers churned last month. Copyright ©2024 Pearson Education, Inc.


8.34

$n = \frac{Z^2\sigma^2}{e^2} = \frac{1.96^2 (15^2)}{5^2} = 34.57$ Use n = 35

8.35

$n = \frac{Z^2\sigma^2}{e^2} = \frac{2.58^2 (90^2)}{20^2} = 134.7921$ Use n = 135

8.36

$n = \frac{Z^2\pi(1-\pi)}{e^2} = \frac{2.58^2 (0.5)(0.5)}{(0.04)^2} = 1{,}040.06$ Use n = 1,041

8.37

$n = \frac{Z^2\pi(1-\pi)}{e^2} = \frac{1.96^2 (0.38)(0.62)}{(0.03)^2} = 1{,}005.6455$ Use n = 1006

8.38

(a) $n = \frac{Z^2\sigma^2}{e^2} = \frac{1.96^2 (400^2)}{50^2} = 245.86$ Use n = 246

(b) $n = \frac{Z^2\sigma^2}{e^2} = \frac{1.96^2 (400^2)}{25^2} = 983.41$ Use n = 984

8.39

$n = \frac{Z^2\sigma^2}{e^2} = \frac{1.96^2 (0.025)^2}{(0.0035)^2} = 196$ Use n = 196

8.40

Excel Output:

$n = \frac{Z^2\sigma^2}{e^2} = \frac{1.96^2 (1500^2)}{400^2} = 54.0225$ Use n = 55

8.41

$n = \frac{Z^2\sigma^2}{e^2} = \frac{1.96^2 (0.045)^2}{(0.015)^2} = 34.5744$ Use n = 35

8.42

(a) $n = \frac{Z^2\sigma^2}{e^2} = \frac{2.5758^2 (20^2)}{5^2} = 106.1583$ Use n = 107

(b) $n = \frac{Z^2\sigma^2}{e^2} = \frac{1.96^2 (20^2)}{5^2} = 61.4633$ Use n = 62

8.43

(a) $n = \frac{Z^2\sigma^2}{e^2} = \frac{1.645^2 (30^2)}{4^2} = 152.2$ Use n = 153

(b) $n = \frac{Z^2\sigma^2}{e^2} = \frac{2.5758^2 (30^2)}{4^2} = 373.2$ Use n = 374

8.44

Note: All the answers are computed using PHStat. Answers computed otherwise may be slightly different due to rounding.

(a) $n = \frac{Z^2\sigma^2}{e^2} = \frac{1.96^2 (2^2)}{0.25^2} = 245.85$ Use n = 246

(b) $n = \frac{Z^2\sigma^2}{e^2} = \frac{1.96^2 (2.5^2)}{0.25^2} = 384.15$ Use n = 385

(c) $n = \frac{Z^2\sigma^2}{e^2} = \frac{1.96^2 (3.0^2)}{0.25^2} = 553.17$ Use n = 554

(d) When there is more variability in the population, a larger sample is needed to accurately estimate the mean.
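The effect of variability on the required sample size in 8.44 can be checked with a short script. This is a sketch under the assumption that the required n is the formula value rounded up to the next integer, as in the answers above; `sample_size_for_mean` is a hypothetical helper name.

```python
from math import ceil
from statistics import NormalDist


def sample_size_for_mean(sigma, e, confidence=0.95):
    """Smallest integer n with n >= Z^2 * sigma^2 / e^2."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return ceil((z * sigma / e) ** 2)


# Problem 8.44: sampling error e = 0.25 with increasing standard deviations.
for sigma in (2.0, 2.5, 3.0):
    print(sigma, sample_size_for_mean(sigma, 0.25))
```

Holding the sampling error and confidence level fixed, the required sample size grows with the square of the standard deviation.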

8.45

(a)

Excel Output: Data Estimate of True Proportion

0.36

Sampling Error

0.04

Confidence Level

95%

Intermediate Calculations Z Value

-1.9600

Calculated Sample Size

553.1701

Result Sample Size Needed

554.0000

$n = \frac{Z^2\pi(1-\pi)}{e^2} = \frac{1.96^2 (0.36)(1-0.36)}{0.04^2} = 553.19$ Use n = 554

(b)

Excel Output:

Data

Estimate of True Proportion

0.36

Sampling Error

0.04

Confidence Level

99%

Intermediate Calculations Z Value

-2.5758

Calculated Sample Size

955.4251

Result Sample Size Needed

956.0000

$n = \frac{Z^2\pi(1-\pi)}{e^2} = \frac{2.5758^2 (0.36)(1-0.36)}{0.04^2} = 955.4$ Use n = 956

8.45 cont.

(c)

Excel Output:

Data

Estimate of True Proportion

0.36

Sampling Error

0.02

Confidence Level

95%

Intermediate Calculations Z Value

-1.9600

Calculated Sample Size

2212.6803

Result Sample Size Needed

2213.0000

$n = \frac{Z^2\pi(1-\pi)}{e^2} = \frac{1.96^2 (0.36)(1-0.36)}{0.02^2} = 2212.76$ Use n = 2213

(d)

Excel Output:

Data

Estimate of True Proportion

0.36

Sampling Error

0.02

Confidence Level

99%

Intermediate Calculations Z Value

-2.5758

Calculated Sample Size

3821.7004

Result Sample Size Needed

3822.0000

$n = \frac{Z^2\pi(1-\pi)}{e^2} = \frac{2.5758^2 (0.36)(1-0.36)}{0.02^2} = 3821.6$ Use n = 3822

(e) The higher the level of confidence desired, the larger the sample size required. The smaller the sampling error desired, the larger the sample size required.
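The four cases in 8.45 can be tabulated together to see both effects at once. The sketch below assumes the same round-up rule as the answers above; the helper name `sample_size_for_proportion` is illustrative only.

```python
from math import ceil
from statistics import NormalDist


def sample_size_for_proportion(pi, e, confidence=0.95):
    """Smallest integer n with n >= Z^2 * pi * (1 - pi) / e^2."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return ceil(z ** 2 * pi * (1 - pi) / e ** 2)


# Problem 8.45: estimated proportion 0.36, two sampling errors, two confidence levels.
for e in (0.04, 0.02):
    for confidence in (0.95, 0.99):
        n = sample_size_for_proportion(0.36, e, confidence)
        print(f"e = {e}, confidence = {confidence:.0%}: n = {n}")
```

Halving the sampling error roughly quadruples n, while moving from 95% to 99% confidence multiplies n by about 1.7.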

8.46

(a)

Excel Output: pricing expectations of potential targets Confidence Interval Estimate for the Proportion Data Sample Size

229

Number of Successes

167

Confidence Level

95%

Intermediate Calculations Sample Proportion

0.729257642

Z Value

-1.9600

Standard Error of the Proportion

0.0294

Interval Half Width

0.0576

Confidence Interval Interval Lower Limit

0.6717

Interval Upper Limit

0.7868

$p = \frac{X}{n} = \frac{167}{229} = 0.729$; $p \pm Z\sqrt{\frac{p(1-p)}{n}} = 0.729 \pm 1.96\sqrt{\frac{0.729(1-0.729)}{229}}$; $0.6717 \le \pi \le 0.7868$

(b)

Excel Output: culture/integration of personnel

Data

Sample Size

229

Number of Successes

128

Confidence Level

95%

Intermediate Calculations

Sample Proportion

0.558951965

Z Value

-1.9600

Standard Error of the Proportion

0.0328

Interval Half Width

0.0643

Confidence Interval Interval Lower Limit

0.4946

Interval Upper Limit

0.6233

$p = \frac{X}{n} = \frac{128}{229} = 0.559$; $p \pm Z\sqrt{\frac{p(1-p)}{n}} = 0.559 \pm 1.96\sqrt{\frac{0.559(1-0.559)}{229}}$; $0.4946 \le \pi \le 0.6233$

8.46 cont.

(c)

Excel Output: technology integration Data Sample Size

229

Number of Successes

48

Confidence Level

95%

Intermediate Calculations Sample Proportion

0.209606987

Z Value

-1.9600

Standard Error of the Proportion

0.0269

Interval Half Width

0.0527

Confidence Interval Interval Lower Limit

0.1569

Interval Upper Limit

0.2623

$p = \frac{X}{n} = \frac{48}{229} = 0.210$; $p \pm Z\sqrt{\frac{p(1-p)}{n}} = 0.210 \pm 1.96\sqrt{\frac{0.210(1-0.210)}{229}}$; $0.1569 \le \pi \le 0.2623$

(d)

Excel Output: pricing expectations of potential targets (a)

Data

Estimate of True Proportion

0.729257642

Sampling Error

0.02

Confidence Level

95%

Intermediate Calculations

Z Value

-1.9600

Calculated Sample Size

1896.1530

Result Sample Size Needed

1897.0000

$n = \frac{Z^2\pi(1-\pi)}{e^2} = \frac{1.96^2 (0.729)(1-0.729)}{0.02^2} \approx 1896.2$ (computed with the unrounded sample proportion) Use n = 1,897

8.46 cont.

(d)

Excel Output: culture/integration of personnel (b) Data Estimate of True Proportion

0.558951965

Sampling Error

0.02

Confidence Level

95%

Intermediate Calculations Z Value

-1.9600

Calculated Sample Size

2367.5359

Result Sample Size Needed

2368.0000

$n = \frac{Z^2\pi(1-\pi)}{e^2} = \frac{1.96^2 (0.559)(1-0.559)}{0.02^2} = 2367.5$ Use n = 2,368

Excel Output: technology integration (c)

Data

Estimate of True Proportion

0.209606987

Sampling Error

0.02

Confidence Level

95%

Intermediate Calculations Z Value

-1.9600

Calculated Sample Size

1591.0544

Result Sample Size Needed

1592.0000

$n = \frac{Z^2\pi(1-\pi)}{e^2} = \frac{1.96^2 (0.210)(1-0.210)}{0.02^2} = 1{,}591.1$ Use n = 1,592

8.47

(a)

Excel Output: Data Sample Size

400

Number of Successes

196

Confidence Level

95%

Intermediate Calculations Sample Proportion

0.49

Z Value

-1.9600

Standard Error of the Proportion

0.0250

Interval Half Width

0.0490

Confidence Interval Interval Lower Limit

0.4410

Interval Upper Limit

0.5390

$p = \frac{X}{n} = \frac{196}{400} = 0.49$; $p \pm Z\sqrt{\frac{p(1-p)}{n}} = 0.49 \pm 1.96\sqrt{\frac{0.49(1-0.49)}{400}}$; $0.4410 \le \pi \le 0.5390$

(b)

You are 95% confident that the population proportion of nonprofit professionals who indicate that ensuring employees are properly trained and serving their mission are their most important goals for the coming year is somewhere between 44.10% and 53.90%.

(c)

Excel Output: Data Estimate of True Proportion

0.49

Sampling Error

0.01

Confidence Level

95%

Intermediate Calculations

Z Value

-1.9600

Calculated Sample Size

9599.8056

Result Sample Size Needed

9600.0000

$n = \frac{Z^2\pi(1-\pi)}{e^2} = \frac{1.96^2 (0.49)(1-0.49)}{0.01^2} = 9{,}599.8$ Use n = 9,600

8.48

(a)

If you conducted a follow-up study, you would use $\pi = 0.32$ in the sample size formula because it is based on past information on the proportion.

8.48 cont.

(b)

Excel Output: Data Estimate of True Proportion

0.32

Sampling Error

0.03

Confidence Level

95%

Intermediate Calculations Z Value

-1.9600

Calculated Sample Size

928.7794

Result Sample Size Needed

929.0000

$n = \frac{Z^2\pi(1-\pi)}{e^2} = \frac{1.96^2 (0.32)(1-0.32)}{0.03^2} = 928.8$ Use n = 929

8.49

(a)

Excel Output:

Data

Estimate of True Proportion

0.54

Sampling Error

0.03

Confidence Level

99%

Intermediate Calculations Z Value

-2.5758

Calculated Sample Size

1831.2315

Result Sample Size Needed

1832.0000

$n = \frac{Z^2\pi(1-\pi)}{e^2} = \frac{2.5758^2 (0.54)(1-0.54)}{0.03^2} = 1{,}831.2$ Use n = 1,832

(b)

Excel Output:

Data

Estimate of True Proportion

0.54

Sampling Error

0.05

Confidence Level

99%

Intermediate Calculations Z Value

-2.5758

Calculated Sample Size

659.2433

Result Sample Size Needed

660.0000

$n = \frac{Z^2\pi(1-\pi)}{e^2} = \frac{2.5758^2 (0.54)(1-0.54)}{0.05^2} = 659.2$ Use n = 660

(c) A smaller sampling error requires a larger sample size.

8.50

The only way to have 100% confidence is to obtain the parameter of interest, rather than a sample statistic. From another perspective, the range of the normal and t distributions is infinite, so a Z or t value that contains 100% of the area cannot be obtained.

8.51

The t distribution is used for obtaining a confidence interval for the mean when $\sigma$ is unknown.

8.52

If the confidence level is increased, a greater area under the normal or t distribution needs to be included. This leads to an increased value of Z or t, and thus a wider interval.

8.53

The term $\pi(1-\pi)$ reaches its largest value when the population proportion is 0.5. Hence, the sample size $n = \frac{Z^2\pi(1-\pi)}{e^2}$ needed to determine the proportion is smaller when the population proportion is 0.20 than when the population proportion is 0.50.

8.54

(a)

PC/laptop Excel Output:

$p = 0.84$; $p \pm Z\sqrt{\frac{p(1-p)}{n}} = 0.84 \pm 1.96\sqrt{\frac{0.84(1-0.84)}{1000}}$; $0.8173 \le \pi \le 0.8627$

8.54 cont.

(a)

Smartphone Excel Output:

$p = 0.91$; $p \pm Z\sqrt{\frac{p(1-p)}{n}} = 0.91 \pm 1.96\sqrt{\frac{0.91(1-0.91)}{1000}}$; $0.8923 \le \pi \le 0.9277$

Tablet Excel Output:

$p = 0.5$; $p \pm Z\sqrt{\frac{p(1-p)}{n}} = 0.5 \pm 1.96\sqrt{\frac{0.5(1-0.5)}{1000}}$; $0.4690 \le \pi \le 0.5310$

8.54 cont.

(a)

Smart Watch Excel Output:

$p = 0.1$; $p \pm Z\sqrt{\frac{p(1-p)}{n}} = 0.1 \pm 1.96\sqrt{\frac{0.1(1-0.1)}{1000}}$; $0.0814 \le \pi \le 0.1186$

(b)

Most adults have a PC/laptop and a smartphone. Some adults have a tablet computer and very few have a smart watch.

8.55

(a)

Digital Coupons Excel Output:

$p = 0.49$; $p \pm Z\sqrt{\frac{p(1-p)}{n}} = 0.49 \pm 1.96\sqrt{\frac{0.49(1-0.49)}{731}}$; $0.4535 \le \pi \le 0.5260$

8.55 cont.

(a)

Look up recipes Excel Output:

$p = 0.485$; $p \pm Z\sqrt{\frac{p(1-p)}{n}} = 0.485 \pm 1.96\sqrt{\frac{0.485(1-0.485)}{731}}$; $0.4494 \le \pi \le 0.5219$

Read product reviews Excel Output:

$p = 0.32$; $p \pm Z\sqrt{\frac{p(1-p)}{n}} = 0.32 \pm 1.96\sqrt{\frac{0.32(1-0.32)}{731}}$; $0.2863 \le \pi \le 0.3539$

8.55 cont.

(a)

Locate in-store items Excel Output:

$p = 0.21$; $p \pm Z\sqrt{\frac{p(1-p)}{n}} = 0.21 \pm 1.96\sqrt{\frac{0.21(1-0.21)}{731}}$; $0.1811 \le \pi \le 0.2402$

(b)

About half of smartphone owners use their phone to access digital coupons or look up recipes while shopping in a grocery store. Fewer smartphone owners use their phone to read product reviews or locate items in the store while shopping in a grocery store.

8.56

Note: All the answers are computed using PHStat. Answers computed otherwise may be slightly different due to rounding.

(a) PHStat output:

Confidence Interval Estimate for the Mean

Data
Sample Standard Deviation        3.5
Sample Mean                      51
Sample Size                      40
Confidence Level                 95%

Intermediate Calculations
Standard Error of the Mean       0.553398591
Degrees of Freedom               39
t Value                          2.0227
Interval Half Width              1.1194

Confidence Interval
Interval Lower Limit             49.88
Interval Upper Limit             52.12

$49.88 \le \mu \le 52.12$
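The PHStat result in (a) follows from $\bar{X} \pm t\,S/\sqrt{n}$. The sketch below is an illustration only: it takes the critical t value from the output above (2.0227 for 39 degrees of freedom) as an input rather than computing it, since the Python standard library has no t distribution, and `mean_ci_t` is a hypothetical helper name.

```python
from math import sqrt


def mean_ci_t(xbar, s, n, t_crit):
    """Interval xbar +/- t * s / sqrt(n); t_crit must be supplied by the caller."""
    half_width = t_crit * s / sqrt(n)
    return xbar - half_width, xbar + half_width


# Problem 8.56 (a): sample mean 51, S = 3.5, n = 40, t value 2.0227 from the PHStat output.
lower, upper = mean_ci_t(51, 3.5, 40, 2.0227)
print(f"{lower:.2f} <= mu <= {upper:.2f}")
```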

8.56 cont.

(b)

PHStat output:

Confidence Interval Estimate for the Proportion

Data
Sample Size                        40
Number of Successes                32
Confidence Level                   95%

Intermediate Calculations
Sample Proportion                  0.8000
Z Value                            -1.9600
Standard Error of the Proportion   0.0632
Interval Half Width                0.1240

Confidence Interval
Interval Lower Limit               0.6760
Interval Upper Limit               0.9240

$0.6760 \le \pi \le 0.9240$

(c) $n = \frac{Z^2\sigma^2}{e^2} = \frac{1.96^2 (5^2)}{2^2} = 24.01$ Use n = 25

(d) $n = \frac{Z^2\pi(1-\pi)}{e^2} = \frac{1.96^2 (0.5)(0.5)}{(0.06)^2} = 266.7680$ Use n = 267

(e) If a single sample were to be selected for both purposes, the larger of the two sample sizes (n = 267) should be used.

8.57

(a)

Excel Output:

$\bar{X} \pm t\frac{S}{\sqrt{n}} = 42 \pm 2.680\frac{8}{\sqrt{50}}$; $38.97 \le \mu \le 45.03$

8.57 cont.

(b)

Excel Output:

$p = 0.26$; $p \pm Z\sqrt{\frac{p(1-p)}{n}} = 0.26 \pm 1.96\sqrt{\frac{0.26(1-0.26)}{50}}$; $0.1384 \le \pi \le 0.3816$

8.58

(a)

PHStat output:

Confidence Interval Estimate for the Mean

Data
Sample Standard Deviation        7.3
Sample Mean                      6.2
Sample Size                      25
Confidence Level                 95%

Intermediate Calculations
Standard Error of the Mean       1.46
Degrees of Freedom               24
t Value                          2.0639
Interval Half Width              3.0133

Confidence Interval
Interval Lower Limit             3.19
Interval Upper Limit             9.21

$3.19 \le \mu \le 9.21$

8.58 cont.

(b)

PHStat output:

Confidence Interval Estimate for the Proportion

Data
Sample Size                        25
Number of Successes                13
Confidence Level                   95%

Intermediate Calculations
Sample Proportion                  0.52
Z Value                            -1.9600
Standard Error of the Proportion   0.0999
Interval Half Width                0.1958

Confidence Interval
Interval Lower Limit               0.3242
Interval Upper Limit               0.7158

$0.3242 \le \pi \le 0.7158$

(c) $n = \frac{Z^2\sigma^2}{e^2} = \frac{1.96^2 (8^2)}{1.5^2} = 109.2682$ Use n = 110

(d) $n = \frac{Z^2\pi(1-\pi)}{e^2} = \frac{1.645^2 (0.5)(0.5)}{(0.075)^2} = 120.268$ Use n = 121

(e) If a single sample were to be selected for both purposes, the larger of the two sample sizes (n = 121) should be used.

8.59

Note: All the answers are computed using PHStat. Answers computed otherwise may be slightly different due to rounding.

(a) PHStat output:

Confidence Interval Estimate for the Mean

Data
Sample Standard Deviation        1000
Sample Mean                      12500
Sample Size                      100
Confidence Level                 95%

Intermediate Calculations
Standard Error of the Mean       100
Degrees of Freedom               99
t Value                          1.9842
Interval Half Width              198.4217

Confidence Interval
Interval Lower Limit             12301.58
Interval Upper Limit             12698.42

$\$12{,}301.58 \le \mu \le \$12{,}698.42$

8.59 cont.

(b)

PHStat output:

Confidence Interval Estimate for the Proportion

Data
Sample Size                        100
Number of Successes                30
Confidence Level                   95%

Intermediate Calculations
Sample Proportion                  0.3000
Z Value                            -1.9600
Standard Error of the Proportion   0.0458
Interval Half Width                0.0898

Confidence Interval
Interval Lower Limit               0.2102
Interval Upper Limit               0.3898

$0.2102 \le \pi \le 0.3898$

(c) $n = \frac{Z^2\sigma^2}{e^2} = \frac{2.5758^2 (1000^2)}{250^2} = 106.1583$ Use n = 107

(d) $n = \frac{Z^2\pi(1-\pi)}{e^2} = \frac{1.645^2 (0.3)(1-0.3)}{(0.045)^2} = 280.5749$ Use n = 281

If a single sample were to be selected for both purposes, the larger of the two sample sizes (n = 281) should be used.

8.60

(a) $p \pm Z\sqrt{\frac{p(1-p)}{n}} = 0.31 \pm 1.645\sqrt{\frac{0.31(0.69)}{200}}$; $0.2562 \le \pi \le 0.3638$

(b) $\bar{X} \pm t\frac{S}{\sqrt{n}} = 3.5 \pm 1.9720\frac{2}{\sqrt{200}}$; $3.22 \le \mu \le 3.78$

(c) $\bar{X} \pm t\frac{S}{\sqrt{n}} = 18000 \pm 1.9720\frac{3000}{\sqrt{200}}$; $\$17{,}581.68 \le \mu \le \$18{,}418.32$

8.61

(a) $\bar{X} \pm t\frac{S}{\sqrt{n}} = \$21.34 \pm 1.9949\frac{\$9.22}{\sqrt{70}}$; $\$19.14 \le \mu \le \$23.54$

(b) $p \pm Z\sqrt{\frac{p(1-p)}{n}} = 0.3714 \pm 1.645\sqrt{\frac{0.3714(0.6286)}{70}}$; $0.2764 \le \pi \le 0.4664$

(c) $n = \frac{Z^2\sigma^2}{e^2} = \frac{1.96^2 (10^2)}{1.5^2} = 170.74$ Use n = 171

(d) $n = \frac{Z^2\pi(1-\pi)}{e^2} = \frac{1.645^2 (0.5)(0.5)}{(0.045)^2} = 334.08$ Use n = 335

(e) If a single sample were to be selected for both purposes, the larger of the two sample sizes (n = 335) should be used.

8.62

(a) $\bar{X} \pm t\frac{S}{\sqrt{n}} = \$38.54 \pm 2.0010\frac{\$7.26}{\sqrt{60}}$; $\$36.66 \le \mu \le \$40.42$

(b) $p \pm Z\sqrt{\frac{p(1-p)}{n}} = 0.30 \pm 1.645\sqrt{\frac{0.30(0.70)}{60}}$; $0.2027 \le \pi \le 0.3973$

(c) $n = \frac{Z^2\sigma^2}{e^2} = \frac{1.96^2 (8^2)}{1.5^2} = 109.27$ Use n = 110

(d) $n = \frac{Z^2\pi(1-\pi)}{e^2} = \frac{1.645^2 (0.5)(0.5)}{(0.04)^2} = 422.82$ Use n = 423

(e) If a single sample were to be selected for both purposes, the larger of the two sample sizes (n = 423) should be used.

8.63

(a) $n = \frac{Z^2\pi(1-\pi)}{e^2} = \frac{1.96^2 (0.5)(0.5)}{(0.05)^2} = 384.16$ Use n = 385

If we assume that the population proportion is only 0.50, then a sample of 385 would be required. If the population proportion is 0.90, the sample size required is cut to 139 (calculated value 138.30).

(b) $p \pm Z\sqrt{\frac{p(1-p)}{n}} = 0.84 \pm 1.96\sqrt{\frac{0.84(0.16)}{50}}$; $0.7384 \le \pi \le 0.9416$

(c) The representative can be 95% confident that the actual proportion of bags that will do the job is between 73.84% and 94.16%. He/she can accordingly perform a cost-benefit analysis to decide if he/she wants to sell the Ice Melt product.

8.64

(a) Confidence Interval Estimate for the Proportion

Data
Sample Size                        90
Number of Successes                51
Confidence Level                   95%

Intermediate Calculations
Sample Proportion                  0.566666667
Z Value                            -1.9600
Standard Error of the Proportion   0.0522
Interval Half Width                0.1024

Confidence Interval
Interval Lower Limit               0.4643
Interval Upper Limit               0.6690

$p \pm Z\sqrt{\frac{p(1-p)}{n}} = 0.5667 \pm 1.96\sqrt{\frac{0.5667(1-0.5667)}{90}}$; $0.4643 \le \pi \le 0.6690$

8.64 cont.

(b) Confidence Interval Estimate for the Mean

Data
Sample Standard Deviation        1103.6491
Sample Mean                      563.38
Sample Size                      51
Confidence Level                 95%

Intermediate Calculations
Standard Error of the Mean       154.5417855
Degrees of Freedom               50
t Value                          2.0086
Interval Half Width              310.4063

Confidence Interval
Interval Lower Limit             252.97
Interval Upper Limit             873.79

$\bar{X} \pm t\frac{S}{\sqrt{n}} = 563.38 \pm 2.0086\left(\frac{1103.6491}{\sqrt{51}}\right)$; $\$252.97 \le \mu \le \$873.79$

8.65

(a) $\bar{X} \pm t\frac{S}{\sqrt{n}} = 5.5014 \pm 2.6800\frac{0.1058}{\sqrt{50}}$; $5.46 \le \mu \le 5.54$

(b) Since 5.5 grams is within the 99% confidence interval, the company can claim that the mean weight of tea in a bag is 5.5 grams with a 99% level of confidence.

(c) The assumption is valid as the weight of the tea bags is approximately normally distributed.

8.66

(a)

MiniTab Output

8.66 cont.

(a) $13.4001 \le \mu \le 16.559$

(b) With 95% confidence, the population mean answer time is somewhere between 13.40 and 16.56 seconds.

(c) The assumption is valid as the answer time is approximately normally distributed.

8.67

(a) $\bar{X} \pm t\frac{S}{\sqrt{n}} = 3124.2147 \pm 1.9665\frac{34.713}{\sqrt{368}}$; $3120.66 \le \mu \le 3127.77$

(b) $\bar{X} \pm t\frac{S}{\sqrt{n}} = 3704.0424 \pm 1.9672\frac{46.7443}{\sqrt{330}}$; $3698.98 \le \mu \le 3709.10$

(c)

[Normal Probability Plot: Boston (vertical axis) against Z Value (horizontal axis)]

8.67 cont.

(c)

[Normal Probability Plot: Vermont (vertical axis) against Z Value (horizontal axis)]

(d) The weight for Boston shingles is slightly skewed to the right while the weight for Vermont shingles appears to be slightly skewed to the left. Since the two confidence intervals do not overlap, the mean weight of Vermont shingles is greater than the mean weight of Boston shingles.

8.68

(a) $\bar{X} \pm t\frac{S}{\sqrt{n}} = 0.2641 \pm 1.9741\frac{0.1424}{\sqrt{170}}$; $0.2425 \le \mu \le 0.2856$

(b) $\bar{X} \pm t\frac{S}{\sqrt{n}} = 0.218 \pm 1.9772\frac{0.1227}{\sqrt{140}}$; $0.1975 \le \mu \le 0.2385$

(c)

[Normal Probability Plot: Vermont (vertical axis) against Z Value (horizontal axis)]

8.68 cont.

(c)

[Normal Probability Plot: Boston (vertical axis) against Z Value (horizontal axis)]

(d) The amount of granule loss for both brands is skewed to the right, but the sample sizes are large enough that the violation of the normality assumption is not critical. Because the two confidence intervals do not overlap, you can conclude that the mean granule loss of Boston shingles is higher than that of Vermont shingles.

8.69

Report Writing Exercise answers will vary. An example of a report would be as follows: One can conclude with 95% confidence that the mean time for a human agent to answer a call to the financial service center is between 13.401 and 16.559 seconds. The validity of this confidence interval estimate depends on the assumption that the processing time is normally distributed. From the box plot below the answer time appears approximately symmetric so the validity of the confidence interval is not in serious doubt.



Chapter 9

9.1

Decision rule: Reject H0 if Z STAT < –1.65 or Z STAT > +1.65. Decision: Since ZSTAT = –1.76 is less than the lower critical value of –1.65, reject H0.

9.2

Decision rule: Reject H0 if Z STAT < –1.96 or Z STAT > +1.96. Decision: Since ZSTAT = +2.21 is greater than the upper critical value of + 1.96, reject H0.

9.3

Decision rule: Reject H0 if Z STAT < –2.58 or Z STAT > +2.58.

9.4

Decision rule: Reject H0 if Z STAT < –2.58 or Z STAT > +2.58.

9.5

Decision rule: Reject H0 if Z STAT < –2.58 or Z STAT > +2.58. Decision: Since ZSTAT = –2.51 is between the two critical values, do not reject H0.

9.6

For a two-tail hypothesis test where ZSTAT = +2.00, p-value = 2(1 – 0.9772) = 0.0456

9.7

Since the p-value of 0.0456 is less than the 0.05 level of significance, the statistical decision is to reject the null hypothesis.

9.8

For two-tail hypothesis test where ZSTAT = –1.38, p-value = 0.0838 + (1 – 0.9162) = 0.1676
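The two-tail p-values in 9.6 and 9.8 come from the standard normal cumulative distribution. A minimal sketch, assuming only the Python standard library (`two_tail_p_value` is an illustrative name):

```python
from statistics import NormalDist


def two_tail_p_value(z_stat):
    """Area in both tails beyond |z_stat| under the standard normal curve."""
    return 2 * (1 - NormalDist().cdf(abs(z_stat)))


# Problems 9.6 and 9.8.
print(round(two_tail_p_value(2.00), 4))
print(round(two_tail_p_value(-1.38), 4))
```

The first value can differ from 0.0456 in the fourth decimal place because the table value 0.9772 used in 9.6 is itself rounded.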

9.9

$\alpha$ is the probability of incorrectly convicting the defendant when he is innocent. $\beta$ is the probability of incorrectly failing to convict the defendant when he is guilty.

9.10

Under the French judicial system, unlike the United States, the null hypothesis assumes the defendant is guilty, the alternative hypothesis assumes the defendant is innocent. A Type I error would be not convicting a guilty person and a Type II error would be convicting an innocent person.

9.11

(a)

A Type I error is the mistake of approving an unsafe drug. A Type II error is not approving a safe drug.

(b)

The consumer groups are trying to avoid a Type I error.

(c)

The industry lobbyists are trying to avoid a Type II error.

(d)

To lower both Type I and Type II errors, the FDA can require more information and evidence in the form of more rigorous testing. This can easily translate into a longer time to approve a new drug.

9.12

$H_0: \mu = 20$ minutes. 20 minutes is adequate travel time between classes. $H_1: \mu \ne 20$ minutes. 20 minutes is not adequate travel time between classes.

9.13

(a)

$H_0: \mu = 13$ hours; $H_1: \mu \ne 13$ hours

(b)

A Type I error is the mistake of concluding that the mean number of hours spent by business seniors at your school is different from the 13-hour-per-week benchmark reported by The National Survey of Student Engagement when in fact it is not any different.

9.13 cont.

(c)

A Type II error is the mistake of not concluding that the mean number of hours spent by business seniors at your school is different from the 13-hour-per-week benchmark reported by The National Survey of Student Engagement when it is in fact different.

9.14

(a)

Minitab Output:

$H_0: \mu = 50{,}000$. The mean life of a large shipment of LEDs is equal to 50,000 hours.
$H_1: \mu \ne 50{,}000$. The mean life of a large shipment of LEDs differs from 50,000 hours.
Decision rule: Reject $H_0$ if $|Z_{STAT}| > 1.96$.
Test statistic: $Z_{STAT} = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} = \frac{49{,}875 - 50{,}000}{1{,}500/\sqrt{64}} = -0.67$
Decision: Since $-1.96 < Z_{STAT} = -0.67 < 1.96$, do not reject $H_0$. There is not enough evidence to conclude that the mean life of a large shipment of LEDs differs from 50,000 hours.

(b)

p-value = 0.505. If the population mean life of a large shipment of LEDs is indeed equal to 50,000 hours, there is a 50.5% chance of observing a test statistic at least as contradictory to the null hypothesis as the sample result.

(c)

$\bar{X} \pm Z_{\alpha/2}\frac{\sigma}{\sqrt{n}} = 49{,}875 \pm 1.96\frac{1{,}500}{\sqrt{64}}$; $49{,}508 \le \mu \le 50{,}242$

(d)

Because the interval includes the hypothesized value of 50,000 hours, you do not reject the null hypothesis. There is insufficient evidence that the mean life of a large shipment of LEDs differs from 50,000 hours. The same decision was reached using the two-tailed hypothesis test.
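The agreement between the confidence interval in (c) and the two-tail test in (a) can be illustrated with a few lines of code. This is a sketch, assuming a known population standard deviation as in problem 9.14; `z_interval` is a hypothetical helper name.

```python
from math import sqrt
from statistics import NormalDist


def z_interval(xbar, sigma, n, confidence=0.95):
    """Interval xbar +/- Z * sigma / sqrt(n) for a known population sigma."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    half_width = z * sigma / sqrt(n)
    return xbar - half_width, xbar + half_width


# Problem 9.14: sample mean 49,875, sigma = 1,500, n = 64, hypothesized mean 50,000.
lower, upper = z_interval(49875, 1500, 64)
reject_h0 = not (lower <= 50000 <= upper)
print(round(lower), round(upper), reject_h0)
```

A 95% interval that contains the hypothesized mean corresponds to not rejecting the null hypothesis in a two-tail test at the 0.05 level of significance.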

9.15

(a)

(a) PHStat Output:

Z Test of Hypothesis for the Mean

Data
Null Hypothesis $\mu$ =            50000
Level of Significance            0.05
Population Standard Deviation    1000
Sample Size                      64
Sample Mean                      49875

Intermediate Calculations
Standard Error of the Mean       125.0000
Z Test Statistic                 -1.0000

Two-Tail Test
Lower Critical Value             -1.9600
Upper Critical Value             1.9600
p-Value                          0.3173
Do not reject the null hypothesis

Confidence Interval Estimate for the Mean

Data
Population Standard Deviation    1000
Sample Mean                      49875
Sample Size                      64
Confidence Level                 95%

Intermediate Calculations
Standard Error of the Mean       125.0000
Z Value                          -1.9600
Interval Half Width              244.9955

Confidence Interval
Interval Lower Limit             49630.00
Interval Upper Limit             50120.00

$H_0: \mu = 50{,}000$. The mean life of a large shipment of LEDs is equal to 50,000 hours.
$H_1: \mu \ne 50{,}000$. The mean life of a large shipment of LEDs differs from 50,000 hours.
Decision rule: Reject $H_0$ if $|Z_{STAT}| > 1.96$.
Test statistic: $Z_{STAT} = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} = \frac{49{,}875 - 50{,}000}{1{,}000/\sqrt{64}} = -1.00$
Decision: Since $-1.96 < Z_{STAT} = -1.00 < 1.96$, do not reject $H_0$. There is not enough evidence to conclude that the mean life of a large shipment of LEDs differs from 50,000 hours.

(b) p-value = 0.3173. If the population mean life of a large shipment of LEDs is indeed equal to 50,000 hours, there is a 31.73% chance of observing a test statistic at least as contradictory to the null hypothesis as the sample result.

(c) $\bar{X} \pm Z_{\alpha/2}\frac{\sigma}{\sqrt{n}} = 49{,}875 \pm 1.96\frac{1{,}000}{\sqrt{64}}$; $49{,}630 \le \mu \le 50{,}120$

(d) Because the interval includes the hypothesized value of 50,000 hours, you do not reject the null hypothesis. There is insufficient evidence that the mean life of a large shipment of LEDs differs from 50,000 hours. The same decision was reached using the two-tailed hypothesis test.

(b)

Comparing the results to (a) of problem 9.14, the smaller standard deviation does not result in a rejection of the null hypothesis.
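The comparison between 9.14 and 9.15 can be rerun by changing only the population standard deviation. The following sketch assumes a two-tail Z test with known sigma; `z_test_mean` is an illustrative name, not a PHStat or Minitab function.

```python
from math import sqrt
from statistics import NormalDist


def z_test_mean(xbar, mu0, sigma, n, alpha=0.05):
    """Two-tail Z test for the mean; returns (Z statistic, p-value, reject H0?)."""
    z_stat = (xbar - mu0) / (sigma / sqrt(n))
    p_value = 2 * (1 - NormalDist().cdf(abs(z_stat)))
    return z_stat, p_value, p_value < alpha


# Same sample (mean 49,875, n = 64) under the two standard deviations.
for sigma in (1500, 1000):
    z_stat, p_value, reject = z_test_mean(49875, 50000, sigma, 64)
    print(f"sigma = {sigma}: Z = {z_stat:.2f}, p-value = {p_value:.4f}, reject H0: {reject}")
```

The smaller standard deviation moves the test statistic further from zero and lowers the p-value, but neither case crosses the 0.05 threshold.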

9.16

(a) PHStat output:

Data
Null Hypothesis: μ = 1
Level of Significance: 0.01
Population Standard Deviation: 0.02
Sample Size: 50
Sample Mean: 0.995

Intermediate Calculations
Standard Error of the Mean: 0.002828427
Z Test Statistic: –1.767766953

Two-Tail Test
Lower Critical Value: –2.575829304
Upper Critical Value: 2.575829304
p-Value: 0.077099872
Do not reject the null hypothesis

H0: μ = 1. The mean amount of water is 1 gallon.
H1: μ ≠ 1. The mean amount of water differs from 1 gallon.
Decision rule: Reject H0 if |ZSTAT| > 2.5758
Test statistic: $Z_{STAT} = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} = \frac{0.995 - 1}{0.02/\sqrt{50}} = -1.7678$

Decision: Since |ZSTAT| = 1.7678 < 2.5758, do not reject H0. There is not enough evidence to conclude that the mean amount of water contained in 1-gallon bottles purchased from a nationally known water bottling company is different from 1 gallon.

(b) p-value = 0.0771. If the population mean amount of water contained in 1-gallon bottles purchased from a nationally known water bottling company is actually 1 gallon, the probability of obtaining a test statistic that is more than 1.7678 standard error units away from 0 is 0.0771.

(c) PHStat output:

Data
Population Standard Deviation: 0.02
Sample Mean: 0.995
Sample Size: 50
Confidence Level: 99%

Intermediate Calculations
Standard Error of the Mean: 0.002828427
Z Value: –2.5758293
Interval Half Width: 0.007285545

Confidence Interval
Interval Lower Limit: 0.987714455
Interval Upper Limit: 1.002285545

$\bar{X} \pm Z_{\alpha/2}\frac{\sigma}{\sqrt{n}} = 0.995 \pm 2.5758\left(\frac{0.02}{\sqrt{50}}\right)$

$0.9877 \le \mu \le 1.0023$

You are 99% confident that the population mean amount of water contained in 1-gallon bottles purchased from a nationally known water bottling company is somewhere between 0.9877 and 1.0023 gallons.

(d) Since the 99% confidence interval contains the hypothesized value of 1, you do not reject H0. The conclusions are the same.
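The agreement between the confidence interval in (c) and the two-tail test in (a) can be illustrated numerically. This is a sketch under the assumption that σ is known, as in the problem; the helper name `z_interval` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def z_interval(x_bar, sigma, n, confidence):
    """Confidence interval for the mean when sigma is known.

    Returns (lower limit, upper limit, Z critical value).
    """
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    half_width = z * sigma / sqrt(n)
    return x_bar - half_width, x_bar + half_width, z


# Values from the water-bottle problem above (99% confidence, alpha = 0.01).
mu0 = 1.0
lower, upper, z_crit = z_interval(0.995, 0.02, 50, 0.99)
contains_mu0 = lower <= mu0 <= upper

# Two-tail test at the matching significance level.
z_stat = (0.995 - mu0) / (0.02 / sqrt(50))
reject = abs(z_stat) > z_crit

print(round(lower, 4), round(upper, 4), contains_mu0, reject)
```

When the interval contains the hypothesized value, the two-tail test at the matching level of significance does not reject H0, so the two flags are always opposites.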

9.17

(a) PHStat output:

Z Test of Hypothesis for the Mean

Data
Null Hypothesis: μ = 1
Level of Significance: 0.01
Population Standard Deviation: 0.015
Sample Size: 50
Sample Mean: 0.995

Intermediate Calculations
Standard Error of the Mean: 0.0021
Z Test Statistic: -2.3570

Two-Tail Test
Lower Critical Value: -2.5758
Upper Critical Value: 2.5758
p-Value: 0.0184
Do not reject the null hypothesis

Confidence Interval Estimate for the Mean

Data
Population Standard Deviation: 0.015
Sample Mean: 0.995
Sample Size: 50
Confidence Level: 99%

Intermediate Calculations
Standard Error of the Mean: 0.0021
Z Value: -2.5758
Interval Half Width: 0.0055

Confidence Interval
Interval Lower Limit: 0.9895
Interval Upper Limit: 1.0005


H0: μ = 1. The mean amount of water is 1 gallon.
H1: μ ≠ 1. The mean amount of water differs from 1 gallon.
Decision rule: Reject H0 if |ZSTAT| > 2.5758
Test statistic: $Z_{STAT} = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} = \frac{0.995 - 1}{0.015/\sqrt{50}} = -2.3570$

Decision: Since |ZSTAT| = 2.3570 < 2.5758, do not reject H0. There is not enough evidence to conclude that the mean amount of water contained in 1-gallon bottles purchased from a nationally known water bottling company is different from 1 gallon.

(b) p-value = 0.0184. If the population mean amount of water contained in 1-gallon bottles purchased from a nationally known water bottling company is actually 1 gallon, the probability of obtaining a test statistic that is more than 2.3570 standard error units away from 0 is 0.0184.

(c) $\bar{X} \pm Z_{\alpha/2}\frac{\sigma}{\sqrt{n}} = 0.995 \pm 2.5758\left(\frac{0.015}{\sqrt{50}}\right)$

$0.9895 \le \mu \le 1.0005$

You are 99% confident that the population mean amount of water contained in 1-gallon bottles purchased from a nationally known water bottling company is somewhere between 0.9895 and 1.0005 gallons.

(d) Since the 99% confidence interval contains the hypothesized value of 1 gallon, you do not reject H0. The conclusions are the same.

(b) The smaller population standard deviation results in a smaller standard error of the mean and, hence, a larger |ZSTAT| and a smaller p-value. You do not reject H0 in either Problem 9.16 or Problem 9.17.

9.18

$t_{STAT} = \frac{\bar{X} - \mu}{S/\sqrt{n}} = \frac{56 - 50}{12/\sqrt{20}} = 2.2361$

9.19

d.f. = n – 1 = 20 – 1 = 19

9.20

For a two-tailed t test at the 0.05 level of significance with 19 degrees of freedom, the critical values are ±2.0930.

9.21

Excel Output:

t Test for Hypothesis of the Mean

Data
Null Hypothesis: μ = 50
Level of Significance: 0.05
Sample Size: 20
Sample Mean: 56
Sample Standard Deviation: 12

Intermediate Calculations
Standard Error of the Mean: 2.6833
Degrees of Freedom: 19
t Test Statistic: 2.2361

Two-Tail Test
Lower Critical Value: -2.0930
Upper Critical Value: 2.0930
p-Value: 0.0375
Reject the null hypothesis

H0: μ = 50
H1: μ ≠ 50
Decision rule: Reject H0 if tSTAT > 2.0930 or tSTAT < –2.0930 or p-value < α = 0.05; d.f. = 19
Test statistic: $t_{STAT} = \frac{\bar{X} - \mu}{S/\sqrt{n}} = \frac{56 - 50}{12/\sqrt{20}} = 2.2361$
p-value = 0.0375
Decision: Since tSTAT = 2.2361 > 2.0930 and the p-value of 0.0375 < α = 0.05, reject H0. There is enough evidence to conclude that the mean amount is different from 50.

9.22

No, you should not use the t test to test the null hypothesis that μ = 60 on a population that is left-skewed because the sample size (n = 16) is less than 30. The t test assumes that, if the underlying population is not normally distributed, the sample size is sufficiently large to enable the test to be valid. With a small sample (n < 30) from a skewed population, the Central Limit Theorem cannot be relied on to make the sampling distribution of the mean approximately normal, so the t test should not be used.

9.23

Yes, you may use the t test to test the null hypothesis that μ = 60 even though the population is left-skewed because the sample size is sufficiently large (n = 160). The t test assumes that, if the underlying population is not normally distributed, the sample size is large enough for the Central Limit Theorem to make the sampling distribution of the mean approximately normal.

9.24

PHStat output:

t Test for Hypothesis of the Mean

Data
Null Hypothesis: μ = 3.7
Level of Significance: 0.05
Sample Size: 64
Sample Mean: 3.57
Sample Standard Deviation: 0.8

Intermediate Calculations
Standard Error of the Mean: 0.1
Degrees of Freedom: 63
t Test Statistic: –1.3

Two-Tail Test
Lower Critical Value: –1.9983405
Upper Critical Value: 1.9983405
p-Value: 0.1983372
Do not reject the null hypothesis

(a) H0: μ = 3.7
H1: μ ≠ 3.7
Decision rule: Reject H0 if |tSTAT| > 1.9983; d.f. = 63
Test statistic: $t_{STAT} = \frac{\bar{X} - \mu}{S/\sqrt{n}} = \frac{3.57 - 3.7}{0.8/\sqrt{64}} = -1.3$
Decision: Since |tSTAT| = 1.3 < 1.9983, do not reject H0. There is not enough evidence to conclude that the population mean waiting time is different from 3.7 minutes at the 0.05 level of significance.

(b) The sample size of 64 is large enough to apply the Central Limit Theorem; hence, you do not need to be concerned about the shape of the population distribution when conducting the t test in (a). In general, the t test is appropriate for this sample size except for the case where the population is extremely skewed or bimodal.

9.25

Excel Output:

t Test for Hypothesis of the Mean

Data
Null Hypothesis: μ = 8.2
Level of Significance: 0.05
Sample Size: 50
Sample Mean: 8.159
Sample Standard Deviation: 0.051

Intermediate Calculations
Standard Error of the Mean: 0.0072
Degrees of Freedom: 49
t Test Statistic: -5.6846

Two-Tail Test
Lower Critical Value: -2.0096
Upper Critical Value: 2.0096
p-Value: 0.0000
Reject the null hypothesis

(a) H0: μ = 8.20
H1: μ ≠ 8.20
Decision rule: Reject H0 if |tSTAT| > 2.0096 or p-value < α = 0.05; d.f. = 49
Test statistic: $t_{STAT} = \frac{\bar{X} - \mu}{S/\sqrt{n}} = \frac{8.159 - 8.20}{0.051/\sqrt{50}} = -5.6846$
p-value = 0.0000
Decision: Since |tSTAT| = 5.6846 > 2.0096 and the p-value of 0.0000 < α = 0.05, reject H0. There is enough evidence to conclude that the mean amount is different from 8.20 ounces.

(b) The p-value is 0.0000. If the population mean is indeed 8.20 ounces, the probability of obtaining a sample mean that is more than 0.041 ounces away from 8.20 ounces is essentially 0 (0.0000 to four decimal places).

9.26

PHStat output:

t Test for Hypothesis of the Mean

Data
Null Hypothesis: μ = 1475
Level of Significance: 0.05
Sample Size: 100
Sample Mean: 1500
Sample Standard Deviation: 200

Intermediate Calculations
Standard Error of the Mean: 20.0000
Degrees of Freedom: 99
t Test Statistic: 1.2500

Two-Tail Test
Lower Critical Value: -1.9842
Upper Critical Value: 1.9842
p-Value: 0.2142
Do not reject the null hypothesis

(a) H0: μ = $1,475
H1: μ ≠ $1,475
Decision rule: Reject H0 if p-value < 0.05
Test statistic: $t_{STAT} = \frac{\bar{X} - \mu}{S/\sqrt{n}} = \frac{1{,}500 - 1{,}475}{200/\sqrt{100}} = 1.2500$
p-value = 0.2142
Decision: Since the p-value of 0.2142 > 0.05, do not reject H0. There is not enough evidence to conclude that the population mean amount spent on Amazon.com by Amazon Prime member shoppers is different from $1,475.

(b) The p-value is 0.2142. If the population mean is indeed $1,475, the probability of obtaining a test statistic that is more than 1.25 standard error units away from 0 in either direction is 0.2142.

9.27

Excel Output:

t Test for Hypothesis of the Mean

Data
Null Hypothesis: μ = 200
Level of Significance: 0.05
Sample Size: 28
Sample Mean: 198.8
Sample Standard Deviation: 21.4

Intermediate Calculations
Standard Error of the Mean: 4.0442
Degrees of Freedom: 27
t Test Statistic: -0.2967

Two-Tail Test
Lower Critical Value: -2.0518
Upper Critical Value: 2.0518
p-Value: 0.7690
Do not reject the null hypothesis

(a) H0: μ = 200
H1: μ ≠ 200
Decision rule: Reject H0 if |tSTAT| > 2.0518 or p-value < α = 0.05
Test statistic: $t_{STAT} = \frac{\bar{X} - \mu}{S/\sqrt{n}} = \frac{198.8 - 200}{21.4/\sqrt{28}} = -0.2967$
p-value = 0.7690
Decision: Since |tSTAT| = 0.2967 < 2.0518 and the p-value of 0.7690 > 0.05, do not reject H0. There is not enough evidence to conclude that the population mean tread wear index is different from 200.

(b) The p-value is 0.7690. If the population mean is indeed 200, the probability of obtaining a test statistic that is more than 0.2967 standard error units away from 0 in either direction is 0.7690.
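The t test statistic and two-tail p-value reported above can be reproduced from the summary statistics alone. The sketch below is one possible approach, not the method PHStat or Excel uses: because the Python standard library has no t distribution, it integrates the t density numerically with Simpson's rule. All function names are illustrative assumptions.

```python
from math import exp, lgamma, pi, sqrt


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = lgamma((df + 1) / 2) - lgamma(df / 2)
    return exp(log_c) / sqrt(df * pi) * (1 + x * x / df) ** (-(df + 1) / 2)


def t_two_tail_p(t_stat, df, steps=2000):
    """Two-tail p-value: 1 minus the area between -|t| and +|t|."""
    b = abs(t_stat)
    if b == 0:
        return 1.0
    h = b / steps  # steps must be even for Simpson's rule
    total = t_pdf(0, df) + t_pdf(b, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area_0_to_b = total * h / 3
    return 1 - 2 * area_0_to_b


def t_test_mean(x_bar, mu0, s, n):
    """t statistic, degrees of freedom, and two-tail p-value."""
    t_stat = (x_bar - mu0) / (s / sqrt(n))
    return t_stat, n - 1, t_two_tail_p(t_stat, n - 1)


# Tread wear problem (9.27) and Problem 9.21, using the values printed above.
t_927, df_927, p_927 = t_test_mean(198.8, 200, 21.4, 28)
t_921, df_921, p_921 = t_test_mean(56, 50, 12, 20)
print(round(t_927, 4), df_927, round(p_927, 4))
print(round(t_921, 4), df_921, round(p_921, 4))
```

If a statistics package is available, its built-in t distribution functions are a better choice than hand-rolled integration; the sketch only shows where the reported numbers come from.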

9.28

PHStat output:

t Test for Hypothesis of the Mean

Data
Null Hypothesis: μ = 6.5
Level of Significance: 0.05
Sample Size: 15
Sample Mean: 7.09
Sample Standard Deviation: 1.406031226

Intermediate Calculations
Standard Error of the Mean: 0.3630
Degrees of Freedom: 14
t Test Statistic: 1.6344

Two-Tail Test
Lower Critical Value: -2.1448
Upper Critical Value: 2.1448
p-Value: 0.1245
Do not reject the null hypothesis

(a) H0: μ = $6.50
H1: μ ≠ $6.50
Decision rule: Reject H0 if |tSTAT| > 2.1448 or p-value < 0.05
Test statistic: $t_{STAT} = \frac{\bar{X} - \mu}{S/\sqrt{n}} = 1.6344$
Decision: Since |tSTAT| = 1.6344 < 2.1448, do not reject H0. There is not enough evidence to conclude that the mean amount spent for lunch is different from $6.50.

(b) The p-value is 0.1245. If the population mean is indeed $6.50, the probability of obtaining a test statistic that is more than 1.6344 standard error units away from 0 in either direction is 0.1245.

(c) You must assume that the amount spent on lunch is normally distributed.

(d) With a sample size of 15, it is difficult to evaluate the assumption of normality. However, the distribution may be fairly symmetric because the mean and the median are close in value. Also, the boxplot appears only slightly skewed, so the normality assumption does not appear to be seriously violated.

9.29

(a) Minitab Output

H0: μ = 45
H1: μ ≠ 45
Decision rule: Reject H0 if |tSTAT| > 2.0555; d.f. = 26
Test statistic: $t_{STAT} = \frac{\bar{X} - \mu}{S/\sqrt{n}} = \frac{45.22 - 45}{23.15/\sqrt{27}} = 0.05$
Decision: Since |tSTAT| = 0.05 < 2.0555, do not reject H0. There is not enough evidence to conclude that the mean processing time has changed from 45 days.

(b) The population distribution needs to be normal.

(c), (d) The mean is close to the median, and the points on the normal probability plot appear to be increasing approximately in a straight line. The boxplot appears to be approximately symmetrical. Thus, you can assume that the population of processing times is approximately normally distributed. The assumption needed to conduct the t test is valid.

9.30

(a) H0: μ = 2
H1: μ ≠ 2
Decision rule: Reject H0 if |tSTAT| > 2.0096; d.f. = 49
Test statistic: $t_{STAT} = \frac{\bar{X} - \mu}{S/\sqrt{n}} = \frac{2.0007 - 2}{0.0446/\sqrt{50}} = 0.1143$
Decision: Since |tSTAT| = 0.1143 < 2.0096, do not reject H0. There is not enough evidence to conclude that the mean amount of soft drink filled is different from 2.0 liters.

(b) p-value = 0.9095. If the population mean amount of soft drink filled is indeed 2.0 liters, the probability of observing a sample of 50 soft drinks whose sample mean amount of fill differs from 2.0 liters by more than the observed amount is 0.9095.

(c) [Figure: Normal Probability Plot of Amount versus Z Value]

(d) The normal probability plot suggests that the data are approximately normally distributed. Hence, the results in (a) are valid in terms of the normality assumption.

(e) [Figure: Time Series Plot of Amount for observations 1 through 50]

The time series plot of the data reveals that there is a downward trend in the amount of soft drink filled. This violates the assumption that data are drawn independently from a normal population distribution because the amount of fill in consecutive bottles appears to be closely related. As a result, the t test in (a) becomes invalid.

9.31

(a) PHStat output:

Data
Null Hypothesis: μ = 20
Level of Significance: 0.05
Sample Size: 50
Sample Mean: 43.04
Sample Standard Deviation: 41.92605736

Intermediate Calculations
Standard Error of the Mean: 5.929239893
Degrees of Freedom: 49
t Test Statistic: 3.885826921

Two-Tail Test
Lower Critical Value: –2.009575199
Upper Critical Value: 2.009575199
p-Value: 0.000306263
Reject the null hypothesis

H0: μ = 20
H1: μ ≠ 20
Decision rule: Reject H0 if |tSTAT| > 2.0096; d.f. = 49
Test statistic: $t_{STAT} = \frac{\bar{X} - \mu}{S/\sqrt{n}} = \frac{43.04 - 20}{41.9261/\sqrt{50}} = 3.8858$
Decision: Since tSTAT = 3.8858 > 2.0096, reject H0. There is enough evidence to conclude that the mean number of days is different from 20.

(b) The population distribution needs to be normal.

(c) [Figure: Normal Probability Plot of Days versus Z Value]

(d) The normal probability plot indicates that the distribution is skewed to the right. Even though the population distribution is probably not normally distributed, the result obtained in (a) should still be valid due to the Central Limit Theorem as a result of the relatively large sample size of 50.

9.32

(a) H0: μ = 8.46
H1: μ ≠ 8.46
Decision rule: Reject H0 if |tSTAT| > 2.0106; d.f. = 48
Test statistic: $t_{STAT} = \frac{\bar{X} - \mu}{S/\sqrt{n}} = \frac{8.4209 - 8.46}{0.0461/\sqrt{49}} = -5.9355$
Decision: Since |tSTAT| = 5.9355 > 2.0106, reject H0. There is enough evidence to conclude that the mean width of the troughs is different from 8.46 inches.

(b) The population distribution needs to be normal.

(c) [Figure: Normal Probability Plot of Width versus Z Value]

[Figure: Box-and-whisker Plot of Width]

(d) The normal probability plot and the boxplot indicate that the distribution is skewed to the left. Even though the population distribution is not normally distributed, the result obtained in (a) should still be valid due to the Central Limit Theorem as a result of the relatively large sample size of 49.

9.33

(a) H0: μ = 0
H1: μ ≠ 0
Decision rule: Reject H0 if |tSTAT| > 1.9842; d.f. = 99
Test statistic: $t_{STAT} = \frac{\bar{X} - \mu}{S/\sqrt{n}} = \frac{-0.00023 - 0}{0.00170/\sqrt{100}} = -1.3563$
Decision: Since |tSTAT| = 1.3563 < 1.9842, do not reject H0. There is not enough evidence to conclude that the mean difference is different from 0.0 inches.

(b) $\bar{X} \pm t\frac{S}{\sqrt{n}} = -0.00023 \pm 1.9842\left(\frac{0.001696}{\sqrt{100}}\right)$

$-0.0005665 \le \mu \le 0.0001065$

You are 95% confident that the mean difference is somewhere between –0.0005665 and 0.0001065 inches.

(c) Since the 95% confidence interval contains 0, you do not reject the null hypothesis in part (a). Hence, you will make the same decision and arrive at the same conclusion as in (a).

(d) In order for the t test to be valid, the data are assumed to be independently drawn from a population that is normally distributed. Since the sample size is 100, which is considered quite large, the t distribution will provide a good approximation to the sampling distribution of the mean as long as the population distribution is not very skewed.

[Figure: Box-and-whisker Plot of Error]

The boxplot suggests that the data have a distribution that is skewed slightly to the right. Given the relatively large sample size of 100 observations, the t distribution should still provide a good approximation to the sampling distribution of the mean.

9.34

(a) H0: μ = 5.5
H1: μ ≠ 5.5
Decision rule: Reject H0 if |tSTAT| > 2.680; d.f. = 49
Test statistic: $t_{STAT} = \frac{\bar{X} - \mu}{S/\sqrt{n}} = \frac{5.5014 - 5.5}{0.1058/\sqrt{50}} = 0.0935$
Decision: Since |tSTAT| = 0.0935 < 2.680, do not reject H0. There is not enough evidence to conclude that the mean amount of tea per bag is different from 5.5 grams.

(b) $\bar{X} \pm t\frac{S}{\sqrt{n}} = 5.5014 \pm 2.6800\left(\frac{0.1058}{\sqrt{50}}\right)$

$5.46 \le \mu \le 5.54$

With 99% confidence, you can conclude that the population mean amount of tea per bag is somewhere between 5.46 and 5.54 grams.

(c) The conclusions are the same.

9.35

(a) Excel Output:

t Test for Hypothesis of the Mean Data =

Null Hypothesis Level of Significance

121 0.05

Sample Size

30

Sample Mean

115.4

Sample Standard Deviation

56.7089

Intermediate Calculations Standard Error of the Mean Degrees of Freedom t Test Statistic

10.3536 29 -0.5409

Two-Tail Test

Copyright ©2024 Pearson Education, Inc.


xxxii Chapter 10: Two-Sample Tests Lower Critical Value

-2.0452

Upper Critical Value

2.0452

p-Value

0.5927

Do not reject the null hypothesis

H 0 :   121 H1 :   121 Decision rule: Reject H 0 if p-value <   0.05 . Test statistic: tSTAT 

X   115.4  121   0.5409 S 56.7089 n 30

Copyright ©2024 Pearson Education, Inc.


Solutions to End-of-Section and Chapter Review Problems xxxiii 9.35 cont.

(a)

(b) (c)

Decision: Since p-value = 0.5927 > 0.05, do not reject H 0 . There no is evidence to conclude that population mean time spent per day accessing the internet is different from 121 minutes. The population distribution needs to be normal. You could construct charts and observe their appearance. For small- or moderate-sized data sets, construct a stem-and-leaf display or a boxplot. For large data sets, plot a histogram or polygon. You could also compute descriptive numerical measures and compare the characteristics of the data with the theoretical properties of the normal distribution. Compare the mean and median and see if the interquartile range is approximately 1.33 times the standard deviation and if the range is approximately 6 times the standard deviation. Evaluate how the values in the data are distributed. Determine whether approximately two-thirds of the values lie between the mean and ±1 standard deviation. Determine whether approximately four-fifths of the values lie between the mean and ±1.28 standard deviations. Determine whether approximately 19 out of every 20 values lie between the mean ±2 standard deviations.

(d) Minutes
Mean: 115.4
Standard Deviation: 56.7089
Count: 30

Five-Number Summary
Minimum: 4
First Quartile: 59
Median: 137.5
Third Quartile: 166
Maximum: 189
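The numerical comparisons described in (c) can be applied directly to the summary statistics in (d). The following is a minimal sketch; the function name `normality_checks` and the dictionary keys are illustrative assumptions.

```python
def normality_checks(mean, median, sd, q1, q3, minimum, maximum):
    """Compare summary statistics with properties of a normal distribution."""
    iqr = q3 - q1
    data_range = maximum - minimum
    return {
        # close to 0 for a symmetric distribution
        "mean_minus_median": mean - median,
        # approximately 1.33 for a normal distribution
        "iqr_over_sd": iqr / sd,
        # approximately 6 for a normal distribution
        "range_over_sd": data_range / sd,
    }


# Summary statistics for the minutes data in (d).
checks = normality_checks(115.4, 137.5, 56.7089, 59, 166, 4, 189)
for name, value in checks.items():
    print(name, round(value, 3))
```

Here the mean is well below the median, the interquartile range is about 1.89 standard deviations rather than 1.33, and the range is about 3.26 standard deviations rather than 6, all of which are consistent with a distribution that is not normal.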

Both the boxplot and the normal probability plot indicate that the distribution is left-skewed. Hence, the t test in (a) might not be valid.

9.36

p-value = 1 – 0.9772 = 0.0228

9.37

Since the p-value = 0.0228 is less than α = 0.05, reject H0.

9.38

p-value = 0.0838

9.39

Since the p-value = 0.0838 is greater than α = 0.01, do not reject H0.

9.40

p-value = P(Z < 2.38) = 0.9913

9.41

Since the p-value = 0.9913 > 0.05, do not reject the null hypothesis.
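The one-tail p-values in Problems 9.36 and 9.40 are areas under the standard normal curve. The sketch below shows how the direction of the test determines which area is used. The value ZSTAT = 2.00 for Problem 9.36 is an assumption inferred from the cumulative area 0.9772; the helper name `z_p_value` is hypothetical.

```python
from statistics import NormalDist


def z_p_value(z_stat, tail):
    """p-value of a Z test for an 'upper', 'lower', or 'two' tail test."""
    cdf = NormalDist().cdf(z_stat)  # area to the left of z_stat
    if tail == "upper":
        return 1 - cdf
    if tail == "lower":
        return cdf
    if tail == "two":
        return 2 * min(cdf, 1 - cdf)
    raise ValueError("tail must be 'upper', 'lower', or 'two'")


p_upper = z_p_value(2.00, "upper")  # assumed ZSTAT for Problem 9.36
p_lower = z_p_value(2.38, "lower")  # ZSTAT from Problem 9.40
print(round(p_upper, 4), round(p_lower, 4))
```

A positive test statistic in a lower-tail test, as in Problem 9.40, falls on the wrong side of the distribution for rejecting H0, which is why the p-value is close to 1.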

9.42

t = 2.7638

9.43

Since tSTAT = 1.79 < 2.7638, do not reject H0.

9.44

t = –2.5280

9.45

Since tSTAT = –1.15 > –2.5280, do not reject H0

9.46

(a) H0: μ ≤ 8000
H1: μ > 8000
Decision rule: Reject H0 if p-value < 0.05; d.f. = 63
Test statistic: $t_{STAT} = \frac{\bar{X} - \mu}{S/\sqrt{n}} = 2.69$
p-value = 0.005
Decision: Since tSTAT = 2.69 > 1.6694, reject H0. There is evidence to conclude that the population mean bus miles is more than 8000 bus miles.

(b) The p-value is 0.005 < 0.05. The probability of getting a tSTAT statistic greater than 2.69, given that the null hypothesis is true, is 0.005.

9.47

(a) Excel Output:

t Test for Hypothesis of the Mean

Data
Null Hypothesis: μ = 98
Level of Significance: 0.05
Sample Size: 100
Sample Mean: 117
Sample Standard Deviation: 25

Intermediate Calculations
Standard Error of the Mean: 2.5000
Degrees of Freedom: 99
t Test Statistic: 7.6000

Upper-Tail Test
Upper Critical Value: 1.6604
p-Value: 0.0000
Reject the null hypothesis

H0: μ ≤ 98
H1: μ > 98
Decision rule: Reject H0 if p-value < 0.05; d.f. = 99
Test statistic: $t_{STAT} = \frac{\bar{X} - \mu}{S/\sqrt{n}} = \frac{117 - 98}{25/\sqrt{100}} = 7.6000$
p-value = 0.000
Decision: Since tSTAT = 7.6 > 1.6604, reject H0. There is evidence to conclude that the population mean cost to repair is more than $98.

(b) The p-value is 0.0000 < 0.05. The probability of getting a tSTAT statistic greater than 7.6000, given that the null hypothesis is true, is 0.0000 to four decimal places.

9.48

(a) Minitab Output:

H0: μ ≥ 30
H1: μ < 30
Decision rule: Reject H0 if p-value < 0.01; d.f. = 859
Test statistic: $t_{STAT} = \frac{\bar{X} - \mu}{S/\sqrt{n}} = -10.58$
p-value = 0.000
Decision: Since tSTAT = –10.58 < –2.3307, reject H0. Since p-value = 0.000 < 0.01, reject H0. There is evidence to conclude that the population mean wait time is less than 30 minutes.

(b) The probability of getting a sample mean of 24.05 minutes or less if the population mean is 30 minutes is 0.000.

9.49

H0: μ ≥ 23 min. The mean delivery time is not less than 23 minutes.
H1: μ < 23 min. The mean delivery time is less than 23 minutes.

(a) Decision rule: If tSTAT < –1.6896, reject H0.
Test statistic: $t_{STAT} = \frac{\bar{X} - \mu}{S/\sqrt{n}} = \frac{19.6 - 23}{6/\sqrt{36}} = -3.4$
Decision: Since tSTAT = –3.4 is less than –1.6896, reject H0. There is enough evidence to conclude the population mean delivery time has been reduced below the previous value of 23 minutes.

(b) Decision rule: If p-value < 0.05, reject H0.
p-value = 0.0008
Decision: Since p-value = 0.0008 is less than α = 0.05, reject H0. There is enough evidence to conclude the population mean delivery time has been reduced below the previous value of 23 minutes.

(c) The probability of obtaining a sample whose mean is 19.6 minutes or less when the null hypothesis is true is 0.0008.

(d) The conclusions are the same.

9.50

(a) Excel Output:

t Test for Hypothesis of the Mean

Data
Null Hypothesis: μ = 85.5
Level of Significance: 0.01
Sample Size: 133
Sample Mean: 98
Sample Standard Deviation: 9

Intermediate Calculations
Standard Error of the Mean: 0.7804
Degrees of Freedom: 132
t Test Statistic: 16.0174

Upper-Tail Test
Upper Critical Value: 2.3549
p-Value: 0.0000
Reject the null hypothesis

H0: μ ≤ 85.5
H1: μ > 85.5
Decision rule: Reject H0 if p-value < 0.01; d.f. = 132
Test statistic: $t_{STAT} = \frac{\bar{X} - \mu}{S/\sqrt{n}} = \frac{98 - 85.5}{9/\sqrt{133}} = 16.0174$
p-value = 0.000
Decision: Since tSTAT = 16.0174 > 2.3549, reject H0. Since p-value = 0.000 < 0.01, reject H0. There is enough evidence to conclude that the population mean one-time gift donation is greater than $85.50.

(b) The probability of getting a sample mean of $98 or more if the population mean is $85.50 is 0.0000.

9.51

(a) Excel Output:

t Test for Hypothesis of the Mean

Data
Null Hypothesis: μ = 4
Level of Significance: 0.05
Sample Size: 100
Sample Mean: 3.7
Sample Standard Deviation: 2.5

Intermediate Calculations
Standard Error of the Mean: 0.2500
Degrees of Freedom: 99
t Test Statistic: -1.2000

Lower-Tail Test
Lower Critical Value: -1.6604
p-Value: 0.1165
Do not reject the null hypothesis

H0: μ ≥ 4
H1: μ < 4
Decision rule: Reject H0 if p-value < 0.05; d.f. = 99
Test statistic: $t_{STAT} = \frac{\bar{X} - \mu}{S/\sqrt{n}} = \frac{3.7 - 4}{2.5/\sqrt{100}} = -1.2$
p-value = 0.1165
Decision: Since tSTAT = –1.2 > –1.6604, do not reject H0. There is not enough evidence to conclude that the population mean wait time to check out is less than 4 minutes.

(b) Decision: Since p-value = 0.1165 > 0.05, do not reject H0. There is not enough evidence to conclude that the population mean wait time to check out is less than 4 minutes.

(c) The probability of getting a sample mean of 3.70 minutes or less if the population mean is 4 minutes is 0.1165.

(d) The conclusions are the same.

9.52

$p = \frac{X}{n} = \frac{88}{400} = 0.22$

9.53

$Z_{STAT} = \frac{p - \pi}{\sqrt{\frac{\pi(1 - \pi)}{n}}} = \frac{0.22 - 0.25}{\sqrt{\frac{0.25(0.75)}{400}}} = -1.3856$

or

$Z_{STAT} = \frac{X - n\pi}{\sqrt{n\pi(1 - \pi)}} = \frac{88 - 400(0.25)}{\sqrt{400(0.25)(0.75)}} = -1.3856$

9.54

H0: π = 0.20
H1: π ≠ 0.20
Decision rule: If Z < –1.96 or Z > 1.96, reject H0.
Test statistic: $Z_{STAT} = \frac{p - \pi}{\sqrt{\frac{\pi(1 - \pi)}{n}}} = \frac{0.22 - 0.20}{\sqrt{\frac{0.20(0.80)}{400}}} = 1.00$
Decision: Since Z = 1.00 is between the critical bounds of ±1.96, do not reject H0.

9.55

(a)

Excel Output:

Z Test of Hypothesis for the Proportion

Data
Null Hypothesis: π = 0.7
Level of Significance: 0.05
Number of Items of Interest: 78
Sample Size: 125

Intermediate Calculations
Sample Proportion: 0.624
Standard Error: 0.0410
Z Test Statistic: -1.8542

Two-Tail Test
Lower Critical Value: -1.9600
Upper Critical Value: 1.9600
p-Value: 0.0637
Do not reject the null hypothesis

H0: π = 0.70
H1: π ≠ 0.70
Decision rule: If p-value < 0.05, reject H0.
Test statistic: $Z_{STAT} = \frac{p - \pi}{\sqrt{\frac{\pi(1 - \pi)}{n}}} = \frac{\frac{78}{125} - 0.70}{\sqrt{\frac{0.70(1 - 0.70)}{125}}} = -1.8542$
p-value = 0.0637
Decision: Since p-value = 0.0637 > 0.05, do not reject H0. There is not enough evidence that the proportion of college unpaid interns that received full-time job offers post-graduation is different from 0.70.

(b) Minitab Output:

Z Test of Hypothesis for the Proportion

Data
Null Hypothesis: π = 0.7
Level of Significance: 0.05
Number of Items of Interest: 60
Sample Size: 125

Intermediate Calculations
Sample Proportion: 0.48
Standard Error: 0.0410
Z Test Statistic: -5.3675

Two-Tail Test
Lower Critical Value: -1.9600
Upper Critical Value: 1.9600
p-Value: 0.0000
Reject the null hypothesis

H0: π = 0.70
H1: π ≠ 0.70
Decision rule: If p-value < 0.05, reject H0.
Test statistic: $Z_{STAT} = \frac{p - \pi}{\sqrt{\frac{\pi(1 - \pi)}{n}}} = \frac{\frac{60}{125} - 0.70}{\sqrt{\frac{0.70(1 - 0.70)}{125}}} = -5.3675$
p-value = 0.0000
Decision: Since p-value = 0.0000 < 0.05, reject H0. There is evidence that the proportion of college unpaid interns that received full-time job offers post-graduation is different from 0.70. The conclusions are not the same.
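The Z test for the proportion can be computed either from the sample proportion or from the number of items of interest, and the two forms give the same statistic. A minimal sketch, assuming the normal approximation used throughout these solutions; the function name `z_test_proportion` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def z_test_proportion(x, n, pi0):
    """Two-tail Z test for a proportion, computed two equivalent ways."""
    p = x / n
    z_from_p = (p - pi0) / sqrt(pi0 * (1 - pi0) / n)
    z_from_counts = (x - n * pi0) / sqrt(n * pi0 * (1 - pi0))
    p_value = 2 * NormalDist().cdf(-abs(z_from_p))
    return p, z_from_p, z_from_counts, p_value


# Part (a): 78 of 125; part (b): 60 of 125; hypothesized proportion 0.70.
p_a, z_a, z_a_counts, p_value_a = z_test_proportion(78, 125, 0.70)
p_b, z_b, z_b_counts, p_value_b = z_test_proportion(60, 125, 0.70)
print(round(z_a, 4), round(p_value_a, 4))
print(round(z_b, 4), round(p_value_b, 4))
```

Both forms divide the same difference by the same standard error, once on the proportion scale and once on the count scale, so they agree up to floating-point rounding.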

9.56

(a) Excel Output:

Z Test of Hypothesis for the Proportion

Data
Null Hypothesis: π = 0.651
Level of Significance: 0.05
Number of Items of Interest: 60
Sample Size: 100

Intermediate Calculations
Sample Proportion: 0.6
Standard Error: 0.0477
Z Test Statistic: -1.0700

Lower-Tail Test
Lower Critical Value: -1.6449
p-Value: 0.1423
Do not reject the null hypothesis

H0: π ≥ 0.651
H1: π < 0.651
Decision rule: If p-value < 0.05, reject H0.
Test statistic: $Z_{STAT} = \frac{p - \pi}{\sqrt{\frac{\pi(1 - \pi)}{n}}} = \frac{\frac{60}{100} - 0.651}{\sqrt{\frac{0.651(1 - 0.651)}{100}}} = -1.0700$
p-value = 0.1423
Decision: Since ZSTAT = –1.0700 > –1.6449 or p-value = 0.1423 > 0.05, do not reject H0. There is no evidence to show that less than 65.1% of students at your university use the Chrome web browser.

(b) Excel Output:

Z Test of Hypothesis for the Proportion

Data
Null Hypothesis: π = 0.651
Level of Significance: 0.05
Number of Items of Interest: 360
Sample Size: 600

Intermediate Calculations
Sample Proportion: 0.6
Standard Error: 0.0195
Z Test Statistic: -2.6209

Lower-Tail Test
Lower Critical Value: -1.6449
p-Value: 0.0044
Reject the null hypothesis

H0: π ≥ 0.651
H1: π < 0.651
Decision rule: If p-value < 0.05, reject H0.
Test statistic: $Z_{STAT} = \frac{p - \pi}{\sqrt{\frac{\pi(1 - \pi)}{n}}} = \frac{\frac{360}{600} - 0.651}{\sqrt{\frac{0.651(1 - 0.651)}{600}}} = -2.6209$
p-value = 0.0044
Decision: Since ZSTAT = –2.6209 < –1.6449 or p-value = 0.0044 < 0.05, reject H0. There is evidence to show that less than 65.1% of students at your university use the Chrome web browser.

(c) The sample size had an effect on being able to reject the null hypothesis.

(d) You would be very unlikely to reject the null hypothesis with a sample of 20.
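The effect of sample size noted in (c) and (d) can be seen by holding the sample proportion at 0.60 and varying n. This is a sketch only: the n = 20 case assumes the sample proportion would again be 0.60, which the problem does not state, and the function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def lower_tail_proportion_test(p, n, pi0, alpha=0.05):
    """Lower-tail Z test for a proportion: returns (Z, p-value, reject?)."""
    z_stat = (p - pi0) / sqrt(pi0 * (1 - pi0) / n)
    p_value = NormalDist().cdf(z_stat)  # area to the left of Z
    return z_stat, p_value, p_value < alpha


# Same sample proportion (0.60) and hypothesized value (0.651), three sample sizes.
results = {n: lower_tail_proportion_test(0.6, n, 0.651) for n in (20, 100, 600)}
for n, (z_stat, p_value, reject) in results.items():
    print(n, round(z_stat, 4), round(p_value, 4), reject)
```

The standard error shrinks in proportion to the square root of n, so the same difference of 0.051 produces a test statistic that moves further into the rejection region as the sample grows.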

9.57

PHStat output: Z Test of Hypothesis for the Proportion

Data Null Hypothesis

=

0.55

Level of Significance

0.05

Number of Items of Interest

25

Sample Size

45

Intermediate Calculations Sample Proportion

0.555555556

Standard Error

0.0742

Z Test Statistic

0.0749

Two-Tail Test Lower Critical Value

-1.9600

Upper Critical Value

1.9600

p-Value

0.9403

Do not reject the null hypothesis H0:   0.55 H1:   0.55 Decision rule: If ZSTAT < –1.96 or ZSTAT > 1.96, reject H0. 25  0.55 p  45   0.0749 Test statistic: Z STAT   1    0.55 1  0.55  n 45 Decision: Since ZSTAT = 0.0749 is between the two critical bounds, do not reject H0. There is not enough evidence that the proportion of females in this position at this medical center is different from what would be expected in the general workforce.

9.58

Excel Output: Z Test of Hypothesis for the Proportion

Data
Null Hypothesis: π = 0.7
Level of Significance: 0.05
Number of Items of Interest: 144
Sample Size: 200

Intermediate Calculations
Sample Proportion: 0.72
Standard Error: 0.0324
Z Test Statistic: 0.6172

Two-Tail Test
Lower Critical Value: -1.9600
Upper Critical Value: 1.9600
p-Value: 0.5371
Do not reject the null hypothesis

H0: π = 0.70
H1: π ≠ 0.70
Decision rule: If p-value < 0.05, reject H0.
Test statistic: $Z_{STAT} = \dfrac{p - \pi}{\sqrt{\pi(1-\pi)/n}} = \dfrac{144/200 - 0.7}{\sqrt{0.7(1-0.7)/200}} = 0.6172$

p-value = 0.5371

Decision: Since ZSTAT = 0.6172 < 1.96 or p-value = 0.5371 > 0.05, do not reject H0. There is no evidence that the proportion of workers who find it easy or somewhat easy to find adequate workspace to do work at home is different from 0.70.


9.59

PHStat output: Z Test of Hypothesis for the Proportion

Data
Null Hypothesis: π = 0.2
Level of Significance: 0.05
Number of Items of Interest: 155
Sample Size: 500

Intermediate Calculations
Sample Proportion: 0.31
Standard Error: 0.0179
Z Test Statistic: 6.1492

Upper-Tail Test
Upper Critical Value: 1.6449
p-Value: 0.0000
Reject the null hypothesis

(a) H0: π ≤ 0.2
H1: π > 0.2
Decision rule: If ZSTAT > 1.6449 or p-value < 0.05, reject H0.
Test statistic: $Z_{STAT} = \dfrac{p - \pi}{\sqrt{\pi(1-\pi)/n}} = \dfrac{155/500 - 0.2}{\sqrt{0.2(1-0.2)/500}} = 6.1492$
Decision: Since ZSTAT = 6.1492 is larger than the critical bound of 1.6449, reject H0. There is enough evidence to conclude that more than 20% of the customers would upgrade to a new cellphone.

(b) The manager in charge of promotional programs concerning residential customers can use the results in (a) to try to convince potential customers to upgrade to a new cellphone since more than 20% of all potential customers will do so based on the conclusion in (a).

9.60

Excel Output: Z Test of Hypothesis for the Proportion

Data
Null Hypothesis: π = 0.2
Level of Significance: 0.05
Number of Items of Interest: 1095
Sample Size: 4277

Intermediate Calculations
Sample Proportion: 0.256020575
Standard Error: 0.0061
Z Test Statistic: 9.1592

Upper-Tail Test
Upper Critical Value: 1.6449
p-Value: 0.0000
Reject the null hypothesis

H0: π ≤ 0.20
H1: π > 0.20
Decision rule: If p-value < 0.05, reject H0.
Test statistic: $Z_{STAT} = \dfrac{p - \pi}{\sqrt{\pi(1-\pi)/n}} = \dfrac{1095/4277 - 0.20}{\sqrt{0.20(1-0.20)/4277}} = 9.1592$

p-value = 0.0000

Decision: Since ZSTAT = 9.1592 > 1.6449 or p-value = 0.0000 < 0.05, reject H0. There is evidence that the percentage is greater than 20%.

9.61

The null hypothesis represents the status quo or the hypothesis that is to be disproved. The null hypothesis includes an equal sign in its definition of a parameter of interest. The alternative hypothesis is the opposite of the null hypothesis and usually represents taking an action. The alternative hypothesis includes either a less than sign, a not equal to sign, or a greater than sign in its definition of a parameter of interest.

9.62

A Type I error represents rejecting a true null hypothesis, while a Type II error represents not rejecting a false null hypothesis.

9.63

The power of a test is the probability that the null hypothesis will be rejected when the null hypothesis is false.
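As a rough illustration of this definition, the sketch below approximates the power of a lower-tail Z test for a proportion under the normal approximation. The hypothesized value 0.651 and the sample sizes 20 and 600 come from Problem 9.56; the "true" proportion of 0.60 and the function name are assumptions made only for this example.

```python
from math import sqrt
from statistics import NormalDist


def power_lower_tail_proportion(pi0, pi1, n, alpha=0.05):
    """Approximate power of a lower-tail Z test of H0: pi >= pi0
    when the true population proportion is pi1 (normal approximation)."""
    z = NormalDist()
    # Sample proportions below this cutoff fall in the rejection region.
    p_crit = pi0 + z.inv_cdf(alpha) * sqrt(pi0 * (1 - pi0) / n)
    # Probability of falling in the rejection region when pi1 is the truth.
    se_true = sqrt(pi1 * (1 - pi1) / n)
    return z.cdf((p_crit - pi1) / se_true)


# pi1 = 0.60 is a hypothetical true proportion chosen for illustration.
power_small = power_lower_tail_proportion(0.651, 0.60, 20)
power_large = power_lower_tail_proportion(0.651, 0.60, 600)
print(round(power_small, 3), round(power_large, 3))
```

Under these assumptions the power is much higher with n = 600 than with n = 20, which is in line with the sample-size remark in Problem 9.56.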

9.64

In a one-tailed test for a mean or proportion, the entire rejection region is contained in one tail of the distribution. In a two-tailed test, the rejection region is split into two equal parts, one in the lower tail of the distribution, and the other in the upper tail.

9.65

The p-value is the probability of obtaining a test statistic equal to or more extreme than the result obtained from the sample data, given that the null hypothesis is true.

9.66

Assuming a two-tailed test is used, if the hypothesized value for the parameter does not fall into the confidence interval, then the null hypothesis can be rejected.
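The link between a confidence interval and a two-tailed test can be sketched with the summary statistics from Problem 9.72 (sample mean 8.55, S = 1.75, n = 60, hypothesized mean 8.00). The critical value 2.0010 is taken from the PHStat output for that problem rather than computed, and the function name is an assumption for this example.

```python
from math import sqrt


def mean_confidence_interval(x_bar, s, n, t_crit):
    """Confidence interval for a mean from summary statistics,
    given a t critical value supplied by the caller."""
    half_width = t_crit * s / sqrt(n)
    return x_bar - half_width, x_bar + half_width


# t_crit = 2.0010 is the two-tail critical value for d.f. = 59 reported in Problem 9.72.
lower, upper = mean_confidence_interval(8.55, 1.75, 60, 2.0010)
mu0 = 8.00
# Reject H0 in the two-tail test when mu0 lies outside the interval.
reject_h0 = not (lower <= mu0 <= upper)
print(f"{lower:.4f} <= mu <= {upper:.4f}; reject H0: {reject_h0}")
```

The hypothesized mean of 8.00 falls outside the interval, which agrees with the decision to reject H0 in Problem 9.72 (a).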

9.67

The 6-step critical value approach to hypothesis testing is as follows:
(1) State the null hypothesis H0. State the alternative hypothesis H1.
(2) Choose the level of significance α. Choose the sample size n.
(3) Determine the appropriate statistical technique and corresponding test statistic to use.
(4) Set up the critical values that divide the rejection and nonrejection regions.
(5) Collect the data and compute the sample value of the appropriate test statistic.
(6) Determine whether the test statistic has fallen into the rejection or the nonrejection region. The computed value of the test statistic is compared with the critical values for the appropriate sampling distribution to determine whether it falls into the rejection or nonrejection region. Make the statistical decision. If the test statistic falls into the nonrejection region, the null hypothesis H0 cannot be rejected. If the test statistic falls into the rejection region, the null hypothesis is rejected. Express the statistical decision in terms of the particular situation.
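The steps above can be sketched for a two-tail Z test for a proportion, using the data from Problem 9.57 (25 items of interest, n = 45, hypothesized proportion 0.55). This is a minimal sketch, not the PHStat implementation; the function name is an assumption.

```python
from math import sqrt
from statistics import NormalDist


def z_test_proportion_critical(x, n, pi0, alpha=0.05):
    """Two-tail Z test for a proportion using the critical value approach."""
    p = x / n                                           # sample proportion
    z_stat = (p - pi0) / sqrt(pi0 * (1 - pi0) / n)      # step 5: test statistic
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)        # step 4: critical value
    reject = abs(z_stat) > z_crit                       # step 6: decision
    return z_stat, z_crit, reject


z_stat, z_crit, reject = z_test_proportion_critical(25, 45, 0.55)
print(f"Z = {z_stat:.4f}, critical values = ±{z_crit:.4f}, reject H0: {reject}")
```

The test statistic falls between the two critical values, so H0 is not rejected, as in the worked solution to Problem 9.57.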

9.68

The 6-step p-value approach to hypothesis testing is as follows:
(1) State the null hypothesis, H0, and the alternative hypothesis, H1.
(2) Choose the level of significance, α, and the sample size, n.
(3) Determine the appropriate test statistic and the sampling distribution.
(4) Collect the sample data and compute the value of the test statistic.
(5) Compute the p-value.
(6) Make the statistical decision and state the managerial conclusion. If the p-value is greater than or equal to α, you do not reject the null hypothesis, H0. If the p-value is less than α, you reject the null hypothesis.
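The p-value approach can be sketched in the same way. The example uses the data from Problem 9.72 (c) (41 items of interest, n = 60, hypothesized proportion 0.60, upper-tail test); the function name and the `tail` parameter are assumptions made for illustration.

```python
from math import sqrt
from statistics import NormalDist


def z_test_proportion_p_value(x, n, pi0, tail="two"):
    """Z test for a proportion using the p-value approach.
    tail is one of "lower", "upper", or "two"."""
    z = NormalDist()
    z_stat = (x / n - pi0) / sqrt(pi0 * (1 - pi0) / n)
    if tail == "lower":
        p_value = z.cdf(z_stat)
    elif tail == "upper":
        p_value = 1 - z.cdf(z_stat)
    elif tail == "two":
        p_value = 2 * (1 - z.cdf(abs(z_stat)))
    else:
        raise ValueError("tail must be 'lower', 'upper', or 'two'")
    return z_stat, p_value


alpha = 0.05
z_stat, p_value = z_test_proportion_p_value(41, 60, 0.60, tail="upper")
reject = p_value < alpha  # step 6: reject H0 only when the p-value is below alpha
print(f"Z = {z_stat:.4f}, p-value = {p_value:.4f}, reject H0: {reject}")
```

Since the p-value exceeds 0.05, H0 is not rejected, matching the conclusion in Problem 9.72 (c).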

9.69

(a) H0: π = 0.6
H1: π ≠ 0.6

(b) The level of significance is the probability of committing a Type I error, which is the probability of concluding that the population proportion of web page visitors preferring the new design is not 0.60 when in fact 60% of the population of web page visitors prefer the new design. The risk associated with Type II error is the probability of not rejecting the claim that 60% of the population of web page visitors prefer the new design when it should be rejected.

(c) If you reject the null hypothesis for a p-value of 0.20, there is a 20% probability that you may have incorrectly concluded that the population proportion of web page visitors preferring the new design is not 0.60 when in fact 60% of the population of web page visitors prefer the new design.

(d) The argument for raising the level of significance might be that the consequences of incorrectly concluding the proportion is not 60% are deemed not very severe.

(e) Before raising the level of significance of a test, you have to genuinely evaluate whether the cost of committing a Type I error is really not as bad as you have thought.

(f) If the p-value is actually 0.12, you will be more confident about rejecting the null hypothesis. If the p-value is 0.01, you will be even more confident that a Type I error is much less likely to occur.

9.70

(a) A Type I error occurs when a firm is predicted to be a bankrupt firm when it will not go bankrupt.

(b) A Type II error occurs when a firm is predicted to be a non-bankrupt firm when it will go bankrupt.

(c) The executives are trying to avoid a Type I error by adopting a very stringent decision criterion. Only firms that show significant evidence of being in financial stress will be predicted to go bankrupt within the next two years at the chosen level of the possibility of making a Type I error.

(d) If the revised model results in more moderate or large Z scores, the probability of committing a Type I error will increase. Many more of the firms will be predicted to go bankrupt than will actually go bankrupt. On the other hand, the revised model that results in more moderate or large Z scores will lower the probability of committing a Type II error because fewer firms that will actually go bankrupt will be predicted to remain non-bankrupt.

9.71

(a)

Excel Output: t Test for Hypothesis of the Mean

Data
Null Hypothesis: μ = 9.48
Level of Significance: 0.05
Sample Size: 50
Sample Mean: 10.12
Sample Standard Deviation: 3.2

Intermediate Calculations
Standard Error of the Mean: 0.4525
Degrees of Freedom: 49
t Test Statistic: 1.4142

Upper-Tail Test
Upper Critical Value: 1.6766
p-Value: 0.0818
Do not reject the null hypothesis

H0: μ ≤ 9.48
H1: μ > 9.48
Decision rule: Reject H0 if p-value < 0.05. d.f. = 49
Test statistic: $t_{STAT} = \dfrac{\bar{X} - \mu}{S/\sqrt{n}} = \dfrac{10.12 - 9.48}{3.2/\sqrt{50}} = 1.4142$

p-value = 0.0818

Decision: Since tSTAT = 1.4142 < 1.6766 or p-value = 0.0818 > 0.05, do not reject H0.
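The t test statistic in part (a) can be checked from the summary statistics alone. In this minimal sketch the critical value 1.6766 is copied from the Excel output rather than computed, and the function name is an assumption.

```python
from math import sqrt


def t_stat_from_summary(x_bar, mu0, s, n):
    """One-sample t test statistic computed from summary statistics."""
    standard_error = s / sqrt(n)
    return (x_bar - mu0) / standard_error


t_stat = t_stat_from_summary(x_bar=10.12, mu0=9.48, s=3.2, n=50)
t_crit = 1.6766  # upper-tail critical value for d.f. = 49, as reported in the output
reject = t_stat > t_crit
print(f"t = {t_stat:.4f}, reject H0: {reject}")
```

The statistic does not exceed the upper critical value, so H0 is not rejected.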


(b)

Excel Output: Z Test of Hypothesis for the Proportion

Data
Null Hypothesis: π = 0.16
Level of Significance: 0.05
Number of Items of Interest: 19
Sample Size: 50

Intermediate Calculations
Sample Proportion: 0.38
Standard Error: 0.0518
Z Test Statistic: 4.2433

Two-Tail Test
Lower Critical Value: -1.9600
Upper Critical Value: 1.9600
p-Value: 0.0000
Reject the null hypothesis

H0: π = 0.16
H1: π ≠ 0.16
Decision rule: If |ZSTAT| > 1.96, reject H0.
Test statistic: $Z_{STAT} = \dfrac{p - \pi}{\sqrt{\pi(1-\pi)/n}} = \dfrac{19/50 - 0.16}{\sqrt{0.16(1-0.16)/50}} = 4.2433$

p-value = 0.0000

Decision: Since ZSTAT = 4.2433 > 1.96, reject H0.


9.72

PHStat output: t Test for Hypothesis of the Mean

Data
Null Hypothesis: μ = 8
Level of Significance: 0.05
Sample Size: 60
Sample Mean: 8.55
Sample Standard Deviation: 1.75

Intermediate Calculations
Standard Error of the Mean: 0.2259
Degrees of Freedom: 59
t Test Statistic: 2.4344

Two-Tail Test
Lower Critical Value: -2.0010
Upper Critical Value: 2.0010
p-Value: 0.0180
Reject the null hypothesis

(a) H0: μ = $8.00
H1: μ ≠ $8.00
Decision rule: d.f. = 59. If p-value < 0.05, reject H0.
Test statistic: $t_{STAT} = \dfrac{\bar{X} - \mu}{S/\sqrt{n}} = \dfrac{8.55 - 8}{1.75/\sqrt{60}} = 2.4344$
Because tSTAT = 2.4344 > 2.0010, reject H0. There is enough evidence to conclude that the mean amount spent differs from $8.00.

(b) p-value = 0.0180


(c)

Excel Output: Z Test of Hypothesis for the Proportion

Data
Null Hypothesis: π = 0.6
Level of Significance: 0.05
Number of Items of Interest: 41
Sample Size: 60

Intermediate Calculations
Sample Proportion: 0.683333333
Standard Error: 0.0632
Z Test Statistic: 1.3176

Upper-Tail Test
Upper Critical Value: 1.6449
p-Value: 0.0938
Do not reject the null hypothesis

H0: π ≤ 0.60
H1: π > 0.60
Decision rule: If p-value < 0.05, reject H0.
Test statistic: $Z_{STAT} = \dfrac{p - \pi}{\sqrt{\pi(1-\pi)/n}} = \dfrac{41/60 - 0.6}{\sqrt{0.6(1-0.6)/60}} = 1.3176$

p-value = 0.0938

Decision: Since ZSTAT = 1.3176 < 1.6449 and p-value > 0.05, do not reject H0. There is not sufficient evidence to conclude that more than 60% of customers say they "definitely will" recommend the specialty coffee shop to family and friends.


(d)

Excel Output: t Test for Hypothesis of the Mean

Data
Null Hypothesis: μ = 8
Level of Significance: 0.05
Sample Size: 60
Sample Mean: 9.2
Sample Standard Deviation: 1.75

Intermediate Calculations
Standard Error of the Mean: 0.2259
Degrees of Freedom: 59
t Test Statistic: 5.3115

Two-Tail Test
Lower Critical Value: -2.0010
Upper Critical Value: 2.0010
p-Value: 0.0000
Reject the null hypothesis

H0: μ = $8.00
H1: μ ≠ $8.00
Decision rule: d.f. = 59. If p-value < 0.05, reject H0.
Test statistic: $t_{STAT} = \dfrac{\bar{X} - \mu}{S/\sqrt{n}} = \dfrac{9.20 - 8}{1.75/\sqrt{60}} = 5.3115$
Because tSTAT = 5.3115 > 2.0010, reject H0. With a sample mean of $9.20, there is enough evidence to conclude that the mean amount spent differs from $8.00.


(e)

PHStat output: Z Test of Hypothesis for the Proportion

Data
Null Hypothesis: π = 0.6
Level of Significance: 0.05
Number of Items of Interest: 26
Sample Size: 60

Intermediate Calculations
Sample Proportion: 0.433333333
Standard Error: 0.0632
Z Test Statistic: -2.6352

Upper-Tail Test
Upper Critical Value: 1.6449
p-Value: 0.9958
Do not reject the null hypothesis

H0: π ≤ 0.60
H1: π > 0.60
Decision rule: If p-value < 0.05, reject H0.
Test statistic: $Z_{STAT} = \dfrac{p - \pi}{\sqrt{\pi(1-\pi)/n}} = \dfrac{26/60 - 0.6}{\sqrt{0.6(1-0.6)/60}} = -2.6352$

p-value = 0.9958

Decision: Since ZSTAT = –2.6352 < 1.6449 and p-value > 0.05, do not reject H0.

9.73

(a)

Excel Output: t Test for Hypothesis of the Mean

Data
Null Hypothesis: μ = 100
Level of Significance: 0.05
Sample Size: 75
Sample Mean: 133.7
Sample Standard Deviation: 37.11

Intermediate Calculations
Standard Error of the Mean: 4.2851
Degrees of Freedom: 74
t Test Statistic: 7.8645

Upper-Tail Test
Upper Critical Value: 1.6657
p-Value: 0.0000
Reject the null hypothesis

H0: μ ≤ $100
H1: μ > $100
Decision rule: d.f. = 74. If tSTAT > 1.6657, reject H0.
Test statistic: $t_{STAT} = \dfrac{\bar{X} - \mu}{S/\sqrt{n}} = \dfrac{133.70 - 100}{37.11/\sqrt{75}} = 7.8645$
Decision: Since the test statistic of tSTAT = 7.8645 is greater than the critical bound of 1.6657, reject H0. There is evidence to conclude that the mean reimbursement for office visits to doctors paid by Medicare was more than $100.


(b)

Excel Output: Z Test of Hypothesis for the Proportion

Data
Null Hypothesis: π = 0.1
Level of Significance: 0.05
Number of Items of Interest: 12
Sample Size: 75

Intermediate Calculations
Sample Proportion: 0.16
Standard Error: 0.0346
Z Test Statistic: 1.7321

Two-Tail Test
Lower Critical Value: -1.9600
Upper Critical Value: 1.9600
p-Value: 0.0833
Do not reject the null hypothesis

H0: π ≤ 0.10. At most 10% of all reimbursements for office visits to doctors paid by Medicare are incorrect.
H1: π > 0.10. More than 10% of all reimbursements for office visits to doctors paid by Medicare are incorrect.
Decision rule: If ZSTAT > 1.9600, reject H0.
Test statistic: $Z_{STAT} = \dfrac{p - \pi}{\sqrt{\pi(1-\pi)/n}} = \dfrac{12/75 - 0.10}{\sqrt{0.10(1-0.10)/75}} = 1.7321$
Decision: Since ZSTAT = 1.7321 is less than the critical bound of 1.9600, do not reject H0. There is not enough evidence to conclude that more than 10% of all reimbursements for office visits to doctors paid by Medicare are incorrect.

(c)

To perform the t-test on the population mean, you must assume that the observed sequence in which the data were collected is random and that the data are approximately normally distributed.


(d)

Excel Output: t Test for Hypothesis of the Mean

Data
Null Hypothesis: μ = 100
Level of Significance: 0.05
Sample Size: 75
Sample Mean: 90
Sample Standard Deviation: 37.11

Intermediate Calculations
Standard Error of the Mean: 4.2851
Degrees of Freedom: 74
t Test Statistic: -2.3337

Upper-Tail Test
Upper Critical Value: 1.6657
p-Value: 0.9888
Do not reject the null hypothesis

H0: μ ≤ $100. The mean reimbursement for office visits to doctors paid by Medicare is at most $100.
H1: μ > $100. The mean reimbursement for office visits to doctors paid by Medicare is greater than $100.
Decision rule: d.f. = 74. If tSTAT > 1.6657, reject H0.
Test statistic: $t_{STAT} = \dfrac{\bar{X} - \mu}{S/\sqrt{n}} = \dfrac{90 - 100}{37.11/\sqrt{75}} = -2.3337$
Decision: Since tSTAT = –2.3337 is less than the critical bound of 1.6657, do not reject H0.


(e)

Excel Output: Z Test of Hypothesis for the Proportion

Data
Null Hypothesis: π = 0.1
Level of Significance: 0.05
Number of Items of Interest: 8
Sample Size: 75

Intermediate Calculations
Sample Proportion: 0.106666667
Standard Error: 0.0346
Z Test Statistic: 0.1925

Two-Tail Test
Lower Critical Value: -1.9600
Upper Critical Value: 1.9600
p-Value: 0.8474
Do not reject the null hypothesis

H0: π ≤ 0.10. At most 10% of all reimbursements for office visits to doctors paid by Medicare are incorrect.
H1: π > 0.10. More than 10% of all reimbursements for office visits to doctors paid by Medicare are incorrect.
Decision rule: If ZSTAT > 1.9600, reject H0.
Test statistic: $Z_{STAT} = \dfrac{p - \pi}{\sqrt{\pi(1-\pi)/n}} = \dfrac{8/75 - 0.10}{\sqrt{0.10(1-0.10)/75}} = 0.1925$
Decision: Since ZSTAT = 0.1925 is less than the critical bound of 1.9600, do not reject H0.

9.74

(a)

H0: μ ≥ 5 minutes. The mean waiting time at a bank branch in a commercial district of the city is at least 5 minutes during the 12:00 p.m. to 1 p.m. peak lunch period.
H1: μ < 5 minutes. The mean waiting time at a bank branch in a commercial district of the city is less than 5 minutes during the 12:00 p.m. to 1 p.m. peak lunch period.
Decision rule: d.f. = 14. If tSTAT < –1.7613, reject H0.
Test statistic: $t_{STAT} = \dfrac{\bar{X} - \mu}{S/\sqrt{n}} = \dfrac{4.2866 - 5.0}{1.637985/\sqrt{15}} = -1.6867$
Decision: Since tSTAT = –1.6867 is greater than the critical bound of –1.7613, do not reject H0. There is not enough evidence to conclude that the mean waiting time at a bank branch in a commercial district of the city is less than 5 minutes during the 12:00 p.m. to 1 p.m. peak lunch period.

(b) To perform the t-test on the population mean, you must assume that the observed sequence in which the data were collected is random and that the data are approximately normally distributed.

(c) Normal probability plot:
[Figure: Normal Probability Plot of Waiting Time (vertical axis, 0 to 7) against Z Value (horizontal axis, -2 to 2)]

(d) With the exception of one extreme point, the data are approximately normally distributed.

(e) Based on the results of (a), the manager does not have enough evidence to make that statement.

9.75

(a) Minitab Output:

Excel Output:

H0: μ ≥ 20
H1: μ < 20
Decision rule: Reject H0 if p-value < 0.05. d.f. = 49
Test statistic: $t_{STAT} = \dfrac{\bar{X} - \mu}{S/\sqrt{n}} = -6.39$
p-value = 0.000
Decision: Since tSTAT = –6.39 < –1.6766 or p-value = 0.000 < 0.05, reject H0. There is evidence to conclude that the population mean answer time is less than 20 minutes.

(b) The population distribution needs to be normal.

(c) The mean is close to the median, and the points on the normal probability plot appear to be increasing approximately in a straight line. The boxplot appears to be approximately symmetrical. Thus, you can assume that the population of processing times is approximately normally distributed.

(d) The assumption needed to conduct the t test is valid.

9.76

(a) H0: μ ≥ 0.35
H1: μ < 0.35
Decision rule: Reject H0 if tSTAT < –1.690. d.f. = 35
Test statistic: $t_{STAT} = \dfrac{\bar{X} - \mu}{S/\sqrt{n}} = \dfrac{0.3167 - 0.35}{0.1357/\sqrt{36}} = -1.4735$
Decision: Since tSTAT = –1.4735 > –1.690, do not reject H0. There is not enough evidence to conclude that the mean moisture content for Boston shingles is less than 0.35 pounds per 100 square feet.

(b) p-value = 0.0748. If the population mean moisture content is in fact no less than 0.35 pounds per 100 square feet, the probability of observing a sample of 36 shingles that will result in a sample mean moisture content of 0.3167 pounds per 100 square feet or less is 0.0748.

(c) H0: μ ≥ 0.35
H1: μ < 0.35
Decision rule: Reject H0 if tSTAT < –1.6973. d.f. = 30
Test statistic: $t_{STAT} = \dfrac{\bar{X} - \mu}{S/\sqrt{n}} = \dfrac{0.2735 - 0.35}{0.1373/\sqrt{31}} = -3.1003$
Decision: Since tSTAT = –3.1003 < –1.6973, reject H0. There is enough evidence to conclude that the mean moisture content for Vermont shingles is less than 0.35 pounds per 100 square feet.

(d) p-value = 0.0021. If the population mean moisture content is in fact no less than 0.35 pounds per 100 square feet, the probability of observing a sample of 31 shingles that will result in a sample mean moisture content of 0.2735 pounds per 100 square feet or less is 0.0021.

(e) In order for the t test to be valid, the data are assumed to be independently drawn from a population that is normally distributed. Since the sample sizes are 36 and 31, respectively, which are considered quite large, the t distribution will provide a good approximation to the sampling distribution of the mean as long as the population distribution is not very skewed.

(f) [Figure: Box-and-whisker Plot (Boston), horizontal axis from 0 to 1]
[Figure: Box-and-whisker Plot (Vermont), horizontal axis from 0 to 1]

(g) Both boxplots suggest that the data are skewed slightly to the right, more so for the Boston shingles. However, the moderately large sample sizes (36 and 31) mean that the results of the t test are relatively insensitive to the departure from normality.

9.77

(a) H0: μ = 3150
H1: μ ≠ 3150
Decision rule: Reject H0 if |tSTAT| > 1.9665. d.f. = 367
Test statistic: $t_{STAT} = \dfrac{\bar{X} - \mu}{S/\sqrt{n}} = \dfrac{3124.2147 - 3150}{34.713/\sqrt{368}} = -14.2497$
Decision: Since tSTAT = –14.2497 < –1.9665, reject H0. There is enough evidence to conclude that the mean weight for Boston shingles is different from 3150 pounds.

(b) p-value is virtually zero. If the population mean weight is in fact 3150 pounds, the probability of observing a sample of 368 shingles that will yield a test statistic more extreme than –14.2497 is virtually zero.

(c) H0: μ = 3700
H1: μ ≠ 3700
Decision rule: Reject H0 if |tSTAT| > 1.967. d.f. = 329
Test statistic: $t_{STAT} = \dfrac{\bar{X} - \mu}{S/\sqrt{n}} = \dfrac{3704.0424 - 3700}{46.7443/\sqrt{330}} = 1.571$
Decision: Since |tSTAT| = 1.571 < 1.967, do not reject H0. There is not enough evidence to conclude that the mean weight for Vermont shingles is different from 3700 pounds.

(d) p-value = 0.1171. The probability of observing a sample of 330 shingles that will yield a test statistic more extreme than 1.571 is 0.1171 if the population mean weight is in fact 3700 pounds.

(e) In order for the t test to be valid, the data are assumed to be independently drawn from a population that is normally distributed. Since the sample sizes are 368 and 330, respectively, which are considered large enough, the t distribution will provide a good approximation to the sampling distribution of the mean even if the population is not normally distributed.

9.78

(a) H0: μ = 0.3
H1: μ ≠ 0.3

t Test for Hypothesis of the Mean

Data
Null Hypothesis: μ = 0.3
Level of Significance: 0.05
Sample Size: 170
Sample Mean: 0.26
Sample Standard Deviation: 0.142382504

Intermediate Calculations
Standard Error of the Mean: 0.0109
Degrees of Freedom: 169
t Test Statistic: -3.2912

Two-Tail Test
Lower Critical Value: -1.9741
Upper Critical Value: 1.9741
p-Value: 0.0012
Reject the null hypothesis

Decision rule: Reject H0 if |tSTAT| > 1.9741. d.f. = 169
Test statistic: $t_{STAT} = \dfrac{\bar{X} - \mu}{S/\sqrt{n}} = -3.2912$, p-value = 0.0012
Decision: Since tSTAT = –3.2912 < –1.9741, reject H0. There is enough evidence to conclude that the mean granule loss for Boston shingles is different from 0.3 grams.

(b) p-value is 0.0012. If the population mean granule loss is in fact 0.3 grams, the probability of observing a sample of 170 shingles that will yield a test statistic more extreme than –3.2912 is 0.0012.


(c) H0: μ = 0.3
H1: μ ≠ 0.3

t Test for Hypothesis of the Mean

Data
Null Hypothesis: μ = 0.3
Level of Significance: 0.05
Sample Size: 140
Sample Mean: 0.22
Sample Standard Deviation: 0.122698672

Intermediate Calculations
Standard Error of the Mean: 0.0104
Degrees of Freedom: 139
t Test Statistic: -7.9075

Two-Tail Test
Lower Critical Value: -1.9772
Upper Critical Value: 1.9772
p-Value: 0.0000
Reject the null hypothesis

Decision rule: Reject H0 if |tSTAT| > 1.9772. d.f. = 139
Test statistic: $t_{STAT} = \dfrac{\bar{X} - \mu}{S/\sqrt{n}} = -7.9075$
Decision: Since tSTAT = –7.9075 < –1.9772, reject H0. There is enough evidence to conclude that the mean granule loss for Vermont shingles is different from 0.3 grams.

(d) p-value is virtually zero. The probability of observing a sample of 140 shingles that will yield a test statistic more extreme than –7.9075 is virtually zero if the population mean granule loss is in fact 0.3 grams.

(e) In order for the t test to be valid, the data are assumed to be independently drawn from a population that is normally distributed. Both normal probability plots indicate that the data are slightly right-skewed. Since the sample sizes are 170 and 140, respectively, which are considered large enough, the t distribution will provide a good approximation to the sampling distribution of the mean even if the population is not normally distributed.

9.79

Answers will vary

Chapter 10

10.1

$df = n_1 + n_2 - 2 = 10 + 12 - 2 = 20$

10.2

(a) $S_p^2 = \dfrac{(n_1 - 1) S_1^2 + (n_2 - 1) S_2^2}{(n_1 - 1) + (n_2 - 1)} = \dfrac{(7)(4^2) + (14)(5^2)}{7 + 14} = 22$

$t_{STAT} = \dfrac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{S_p^2 \left( \dfrac{1}{n_1} + \dfrac{1}{n_2} \right)}} = \dfrac{(42 - 34) - 0}{\sqrt{22 \left( \dfrac{1}{8} + \dfrac{1}{15} \right)}} = 3.8959$

(b) d.f. = (n1 – 1) + (n2 – 1) = 7 + 14 = 21

(c) Decision rule: d.f. = 21. If tSTAT > 2.5177, reject H0.

(d) Decision: Since tSTAT = 3.8959 is greater than the critical bound of 2.5177, reject H0. There is enough evidence to conclude that the first population mean is larger than the second population mean.
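The pooled-variance calculation in Problem 10.2 can be sketched from the summary statistics (sample means 42 and 34, standard deviations 4 and 5, sample sizes 8 and 15). The t value 2.0796 used for the confidence interval is the one given in Problem 10.4 and is supplied rather than computed; the function name is an assumption for this example.

```python
from math import sqrt


def pooled_variance_t(x1, s1, n1, x2, s2, n2, hyp_diff=0.0):
    """Pooled-variance t test statistic for the difference between two means,
    computed from summary statistics (assumes equal population variances)."""
    df = (n1 - 1) + (n2 - 1)
    sp2 = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / df   # pooled variance
    se = sqrt(sp2 * (1 / n1 + 1 / n2))                  # standard error of the difference
    t_stat = ((x1 - x2) - hyp_diff) / se
    return sp2, se, df, t_stat


sp2, se, df, t_stat = pooled_variance_t(42, 4, 8, 34, 5, 15)

# 95% confidence interval for mu1 - mu2, using the t value quoted in Problem 10.4.
t_value = 2.0796
half_width = t_value * se
lower, upper = (42 - 34) - half_width, (42 - 34) + half_width
print(f"Sp^2 = {sp2:.0f}, d.f. = {df}, t = {t_stat:.4f}")
print(f"{lower:.4f} <= mu1 - mu2 <= {upper:.4f}")
```

The same standard error serves both the test statistic and the interval, which is why the two results are built from one helper here.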

10.3

Assume that you are sampling from two independent normal distributions having equal variances.

10.4

$(\bar{X}_1 - \bar{X}_2) \pm t \sqrt{S_p^2 \left( \dfrac{1}{n_1} + \dfrac{1}{n_2} \right)} = (42 - 34) \pm 2.0796 \sqrt{22 \left( \dfrac{1}{8} + \dfrac{1}{15} \right)}$

$3.7296 \le \mu_1 - \mu_2 \le 12.2704$

10.5

$df = n_1 + n_2 - 2 = 7 + 6 - 2 = 11$

10.6

Decision: Since tSTAT = 2.6762 is smaller than the upper critical bound of 2.9979, do not reject H0. There is not enough evidence of a difference in the means of the two populations.

10.7

(a) H0: μ1 ≥ μ2. The mean estimated amount of calories in the cheeseburger is not lower for the people who thought about the cheesecake first than for the people who thought about the organic fruit salad first.
H1: μ1 < μ2. The mean estimated amount of calories in the cheeseburger is lower for the people who thought about the cheesecake first than for the people who thought about the organic fruit salad first.

(b) Type I error is the error made in concluding that the mean estimated amount of calories in the cheeseburger is lower for the people who thought about the cheesecake first than for the people who thought about the organic fruit salad first when the mean estimated amount of calories in the cheeseburger is in fact not lower for the people who thought about the cheesecake first than for the people who thought about the organic fruit salad first.

(c) Type II error is the error made in concluding that the mean estimated amount of calories in the cheeseburger is not lower for the people who thought about the cheesecake first than for the people who thought about the organic fruit salad first when the mean estimated amount of calories in the cheeseburger is in fact lower for the people who thought about the cheesecake first than for the people who thought about the organic fruit salad first.

(d) Decision: Since tSTAT = –6.1532 is smaller than the critical bound of –2.4286, reject H0. There is evidence that the mean estimated amount of calories in the cheeseburger is lower for the people who thought about the cheesecake first than for the people who thought about the organic fruit salad first.

(e) The commercial would feature foods high in calories such as peanut butter and chocolate. Based on the results from (d), presentation of foods high in calories would decrease estimates of the amount of calories associated with a cheeseburger.

10.8

(a) Because tSTAT = 2.8990 > 1.6620 or p-value = 0.0024 < 0.05, reject H0. There is evidence that the mean amount of Walker Crisps eaten by children who watched a commercial featuring a long-standing sports celebrity endorser is higher than for those who watched a commercial for an alternative food snack.

(b) $3.4616 \le \mu_1 - \mu_2 \le 18.5384$

(c) The results cannot be compared because (a) is a one-tail test and (b) is a confidence interval that is comparable only to the results of a two-tail test.

(d) You would choose the commercial featuring a long-standing celebrity endorser.

10.9

From PHStat, Population 1 = Traditional, Population 2 = PrePaid

Pooled-Variance t Test for the Difference Between Two Means (assumes equal population variances)

Data
Hypothesized Difference: 0
Level of Significance: 0.05

Population 1 Sample
Sample Size: 7
Sample Mean: 79.85714286
Sample Standard Deviation: 7.537209285

Population 2 Sample
Sample Size: 12
Sample Mean: 82.58333333
Sample Standard Deviation: 3.918680978

Intermediate Calculations
Population 1 Sample Degrees of Freedom: 6
Population 2 Sample Degrees of Freedom: 11
Total Degrees of Freedom: 17
Pooled Variance: 29.9867
Standard Error: 2.6044
Difference in Sample Means: -2.7262
t Test Statistic: -1.0468

Two-Tail Test
Lower Critical Value: -2.1098
Upper Critical Value: 2.1098
p-Value: 0.3099
Do not reject the null hypothesis

(a) A pooled-variance t test revealed that there was no significant difference in mean rating between the two types of cellular providers. Because tSTAT = –1.0468 falls between the critical values of –2.1098 and 2.1098, or p-value = 0.3099 > 0.05, do not reject H0.

(b) The p-value of 0.3099 is well above the 0.05 significance level. This means that the probability of obtaining a test statistic at least as extreme as the one observed, when H0 is true, is 30.99%.

(c) It is necessary to assume both populations associated with the ratings data from the two types of cellular providers are normally distributed.


(d) From PHStat

Confidence Interval Estimate for the Difference Between Two Means

Data
Confidence Level: 95%

Intermediate Calculations
Degrees of Freedom: 17
t Value: 2.1098
Interval Half Width: 5.4947

Confidence Interval
Interval Lower Limit: -8.2209
Interval Upper Limit: 2.7685

Using a 95% confidence interval, the lower limit of the average difference between the two providers is –8.2209 and the upper limit is 2.7685.

(e) Based on the results from a pooled-variance t test, one can conclude that there is no significant difference in satisfaction ratings between traditional cellular providers and prepaid cellular providers.

10.10

(a)

(b)

(c)

The results from a pooled-variance t test revealed no evidence, at the 0.05 level of significance, of a difference between the Southeast region accounting firms and the Gulf Coast accounting firms with respect to the mean number of partners. Because tSTAT = –0.24, or p-value = 0.808, which is well above the 0.05 level, do not reject H0. The p-value means that the probability of obtaining a test statistic at least as far from 0 as the observed value, in either direction, when H0 is true, is 80.8%. The pooled-variance t test assumes that the variances of the two populations of accounting firms are equal and that the number of partners is approximately normally distributed in both populations.

10.11 (a) From PHStat, Population 1 = Before Halftime, Population 2 = During or after halftime

Pooled-Variance t Test for the Difference Between Two Means (assumes equal population variances)

Data
Hypothesized Difference                        0
Level of Significance                          0.05

Population 1 Sample
Sample Size                                    28
Sample Mean                                    5.789285714
Sample Standard Deviation                      0.702028429

Population 2 Sample
Sample Size                                    29
Sample Mean                                    5.534482759
Sample Standard Deviation                      0.70926313

Intermediate Calculations
Population 1 Sample Degrees of Freedom         27
Population 2 Sample Degrees of Freedom         28
Total Degrees of Freedom                       55
Pooled Variance                                0.4980
Standard Error                                 0.1870
Difference in Sample Means                     0.2548
t Test Statistic                               1.3627

Two-Tail Test
Lower Critical Value                           -2.0040
Upper Critical Value                           2.0040
p-Value                                        0.1785
Do not reject the null hypothesis

Confidence Interval Estimate for the Difference Between Two Means

Data
Confidence Level                               95%

Intermediate Calculations
Degrees of Freedom                             55
t Value                                        2.0040
Interval Half Width                            0.3747

Confidence Interval
Interval Lower Limit                           -0.1199
Interval Upper Limit                           0.6295

(b)

(c)

The results from a pooled-variance t test revealed no significant difference between the mean rating of the ads that ran before halftime and the ads that ran during or after halftime. Because tSTAT = 1.3627, or p-value = 0.1785, which is well above the 0.05 level, do not reject H0. The p-value means that the probability of obtaining a test statistic at least as far from 0 as the observed value, in either direction, when H0 is true, is 17.85%. The 95% confidence interval estimate of the difference in mean rating between the ads that ran before halftime and the ads that ran during or after halftime has a lower limit of –0.1199 and an upper limit of 0.6295. One can therefore be 95% confident that the difference in mean rating is between –0.1199 and 0.6295.

10.12

(a)

(b)

(c) (d)

$H_0: \mu_1 = \mu_2$ Mean waiting times of Bank 1 and Bank 2 are the same.
$H_1: \mu_1 \ne \mu_2$ Mean waiting times of Bank 1 and Bank 2 are different.

Since the p-value of 0.000 is less than the 0.05 level of significance, reject the null hypothesis. There is enough evidence to conclude that the mean waiting time is different in the two banks. p-value = 0.000. The probability of obtaining a sample that yields a t test statistic more extreme than –4.13 in either direction is 0.000 (to three decimal places) if, in fact, the mean waiting times of Bank 1 and Bank 2 are the same. We need to assume that the two populations are normally distributed.

$(\bar{X}_1 - \bar{X}_2) \pm t \sqrt{S_p^2 \left( \dfrac{1}{n_1} + \dfrac{1}{n_2} \right)} = (4.2867 - 7.1147) \pm 2.0484 \sqrt{3.5093 \left( \dfrac{1}{15} + \dfrac{1}{15} \right)}$

$-4.2292 < \mu_1 - \mu_2 < -1.4268$

You are 95% confident that the difference in mean waiting time between Bank 1 and Bank 2 is between –4.2292 and –1.4268 minutes.

10.13

$H_0: \mu_1 = \mu_2$ Mean waiting times of Bank 1 and Bank 2 are the same.
$H_1: \mu_1 \ne \mu_2$ Mean waiting times of Bank 1 and Bank 2 are different.

Since the p-value of 0.000 is less than the 5% level of significance, reject the null hypothesis. There is enough evidence to conclude that the mean waiting times are different in the two banks. Both t tests yield the same conclusion.

10.14

From PHStat Pooled-Variance t Test for the Difference Between Two Means (assumes equal population variances) Data Hypothesized Difference Level of Significance

0 0.05

Population 1 Sample Sample Size

20

Sample Mean

42.9135

Sample Standard Deviation

14.10057269

Population 2 Sample Sample Size

11

Sample Mean

29.21

Sample Standard Deviation

16.15989109

Intermediate Calculations Population 1 Sample Degrees of Freedom

19

Population 2 Sample Degrees of Freedom

10

Total Degrees of Freedom

29

Pooled Variance

220.3144

Standard Error

5.5717

Difference in Sample Means

13.7035

t Test Statistic

2.4595

Two-Tail Test Lower Critical Value

-2.0452

Upper Critical Value

2.0452

p-Value

0.0201 Reject the null hypothesis

(a) (b)

(c)

Because tSTAT = 2.4595 > 2.0452, reject H0. There is evidence of a difference in the mean time to start a business between developed and emerging countries. p-value = 0.0201. The probability of obtaining two samples whose means differ by 13.7035 or more in either direction is 0.0201 if there is no difference in the mean time to start a business between developed and emerging countries. You need to assume that the time to start a business is normally distributed in both developed and emerging countries.

10.14 cont.

(c)

From PHStat Confidence Interval Estimate for the Difference Between Two Means

Data Confidence Level

95%

Intermediate Calculations Degrees of Freedom

29

t Value

2.0452

Interval Half Width

11.3955

Confidence Interval

(d)

Interval Lower Limit

2.3080

Interval Upper Limit

25.0990

$2.3080 < \mu_1 - \mu_2 < 25.0990$
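The interval limits for 10.14 follow from the difference in sample means plus or minus the t value times the pooled standard error. A hedged Python sketch of that arithmetic is shown below; the function name `pooled_variance_ci` is illustrative, and the t value 2.0452 is taken from the output rather than computed, so the limits match the output only to about three decimal places.

```python
import math


def pooled_variance_ci(n1, mean1, sd1, n2, mean2, sd2, t_critical):
    """Confidence interval for mu1 - mu2 under the equal-variance assumption."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    half_width = t_critical * math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    diff = mean1 - mean2
    return diff - half_width, diff + half_width


# Summary statistics and t value (df = 29) from the 10.14 output.
lower, upper = pooled_variance_ci(
    20, 42.9135, 14.10057269, 11, 29.21, 16.15989109, t_critical=2.0452
)
print(round(lower, 2), round(upper, 2))
```

Because both limits are positive, the interval excludes 0, which is consistent with rejecting H0 in the two-tail test.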

10.15

From PHStat Separate-Variances t Test for the Difference Between Two Means (assumes unequal population variances) Data Hypothesized Difference Level of Significance

0 0.05

Population 1 Sample Sample Size

20

Sample Mean

42.9135

Sample Standard Deviation

14.1006

Population 2 Sample Sample Size

11

Sample Mean

29.21

Sample Standard Deviation

16.1599

Intermediate Calculations Numerator of Degrees of Freedom

1134.4432

Denominator of Degrees of Freedom

61.5612

Total Degrees of Freedom

18.4279

Degrees of Freedom

18

Standard Error

5.8036

Difference in Sample Means

13.7035

Separate-Variance t Test Statistic

2.3612

Two-Tail Test Lower Critical Value

-2.1009

Upper Critical Value

2.1009

p-Value

0.0297 Reject the null hypothesis

The results of the two analyses were similar. The separate-variance t test, which does not assume equal variances, also revealed a significant difference in the mean time required to start a business between developed and emerging countries.
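The "Numerator of Degrees of Freedom" and "Denominator of Degrees of Freedom" rows in the 10.15 output are the two parts of the separate-variance degrees-of-freedom formula. The sketch below retraces them in Python; the function name is illustrative, and the standard deviations are entered with the extra digits shown in the 10.14 output (same data) so that the intermediate values line up.

```python
import math


def separate_variance_t(n1, mean1, sd1, n2, mean2, sd2):
    """Separate-variance t statistic and its approximate degrees of freedom."""
    v1 = sd1 ** 2 / n1  # estimated variance of the first sample mean
    v2 = sd2 ** 2 / n2  # estimated variance of the second sample mean
    numerator = (v1 + v2) ** 2
    denominator = v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1)
    df_exact = numerator / denominator
    std_error = math.sqrt(v1 + v2)
    t_stat = (mean1 - mean2) / std_error
    return numerator, denominator, df_exact, std_error, t_stat


numerator, denominator, df_exact, std_error, t_stat = separate_variance_t(
    20, 42.9135, 14.10057269, 11, 29.21, 16.15989109
)
df_used = math.floor(df_exact)  # the output reports the integer part, 18

print(round(numerator, 4), round(denominator, 4), round(df_exact, 4))
print(df_used, round(std_error, 4), round(t_stat, 4))
```

The fractional degrees of freedom are truncated to an integer before the critical value is looked up, which is why the output lists both 18.4279 and 18.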

10.16

From PHStat, Population 1 = IOS, Population 2 = Android Pooled-Variance t Test for the Difference Between Two Means (assumes equal population variances) Data Hypothesized Difference

0

Level of Significance

0.05

Population 1 Sample Sample Size

30

Sample Mean

175.2666667

Sample Standard Deviation

139.8368343

Population 2 Sample Sample Size

30

Sample Mean

377.2

Sample Standard Deviation

493.7347047

Intermediate Calculations Population 1 Sample Degrees of Freedom

29

Population 2 Sample Degrees of Freedom

29

Total Degrees of Freedom

58

Pooled Variance

131664.1494

Standard Error

93.6889

Difference in Sample Means

-201.9333

t Test Statistic

-2.1554

Two-Tail Test Lower Critical Value

-2.0017

Upper Critical Value

2.0017

p-Value

0.0353 Reject the null hypothesis

(a)

(b)

Because tSTAT = –2.1554 < –2.0017 or p-value = 0.0353 < 0.05, reject H0. There is evidence of a difference in the mean time per day accessing the Internet via a mobile device between IOS users and Android users. You must assume that each of the two independent populations is normally distributed.

10.17

(a)

From PHStat, Population 1 = Technology, Population 2 = Financial Institutions Pooled-Variance t Test for the Difference Between Two Means (assumes equal population variances) Data Hypothesized Difference

0

Level of Significance

0.05

Population 1 Sample Sample Size

21

Sample Mean

128721.3333

Sample Standard Deviation

224301.7528

Population 2 Sample Sample Size

15

Sample Mean

52231.73333

Sample Standard Deviation

45687.56407

Intermediate Calculations Population 1 Sample Degrees of Freedom

20

Population 2 Sample Degrees of Freedom

14

Total Degrees of Freedom

34

Pooled Variance

30454366914.2824

Standard Error

58995.7547

Difference in Sample Means

76489.6000

t Test Statistic

1.2965

Two-Tail Test Lower Critical Value

-2.0322

Upper Critical Value

2.0322

p-Value

0.2035

Do not reject the null hypothesis

At the 0.05 significance level, there is insufficient evidence of a difference in mean brand value between the technology and financial sectors. Because tSTAT = 1.2965 or p-value = 0.2035, do not reject H0.

10.17 cont.

(b)

From PHStat, Population 1 = Technology, Population 2 = Financial Institutions Separate-Variances t Test for the Difference Between Two Means (assumes unequal population variances) Data Hypothesized Difference

0

Level of Significance

0.05

Population 1 Sample Sample Size

21

Sample Mean

128721.3333

Sample Standard Deviation

224301.7528

Population 2 Sample Sample Size

15

Sample Mean

52231.73333

Sample Standard Deviation

45687.5641

Intermediate Calculations Numerator of Degrees of Freedom

6425880054317410000.0000

Denominator of Degrees of Freedom

288370096112897000.0000

Total Degrees of Freedom

22.2834

Degrees of Freedom

22

Standard Error

50348.1078

Difference in Sample Means

76489.6000

Separate-Variance t Test Statistic

1.5192

Two-Tail Test Lower Critical Value

-2.0739

Upper Critical Value

2.0739

p-Value

0.1430

Do not reject the null hypothesis

At the 0.05 significance level, there is insufficient evidence of a difference in mean brand value between the technology and financial sectors. Because tSTAT = 1.5192 or p-value = 0.1430, do not reject H0.

(c)

Both t tests led to the same conclusion: do not reject H0. For the separate-variance t test, which does not assume equal variances, tSTAT = 1.5192 with a p-value of 0.1430. This p-value is slightly lower than the p-value of 0.2035 from the pooled-variance t test in 10.17 (a). Both p-values are above the 0.05 significance level.

10.18

The degrees of freedom is 20 – 1, or 19.

10.19

The degrees of freedom is 20 – 1, or 19.

10.20

(a) $t_{STAT} = \dfrac{-1.5566}{1.424 / \sqrt{9}} = -3.2772$

Because tSTAT = –3.2772 < –2.306 or p-value = 0.0112 < 0.05, reject H0. There is enough evidence of a difference in the mean summated ratings between the two brands.

(b) You must assume that the distribution of the differences between the two ratings is approximately normal.

(c) p-value = 0.0112. The probability of obtaining a mean difference in ratings that results in a test statistic that deviates from 0 by 3.2772 or more in either direction is 0.0112 if there is no difference in the mean summated ratings between the two brands.

(d) $-2.6501 < \mu_D < -0.4610$. You are 95% confident that the mean difference in summated ratings between brand A and brand B is somewhere between –2.6501 and –0.4610.

10.21

(a) Paired t Test

Data Hypothesized Mean Difference Level of significance

0 0.05

Intermediate Calculations Sample Size

20

DBar

5.1500

Degrees of Freedom

19

SD

3.0826

Standard Error

0.6893

t Test Statistic

7.4714

Two-Tail Test Lower Critical Value

-2.0930

Upper Critical Value

2.0930

p-Value

0.0000

Reject the null hypothesis

(b)

At the 0.05 level, there is sufficient evidence of a difference in the mean ratings between TV and Internet services. A paired-samples t test yielded tSTAT = 7.4714, which is above the upper critical value of 2.0930. Because tSTAT = 7.4714 > 2.0930 or p-value = 0.000 < 0.05, reject H0. The paired-samples t test assumes that the differences between the paired ratings are approximately normally distributed.

(c)

10.21 cont.

(c)

(d)

The differences appear to be left skewed. However, the sample size is very small, which makes it difficult to interpret the histogram for normality. The data also contain one outlier. Removal of this one outlier may lead to a different conclusion. Using the complete dataset, the confidence interval for the mean difference between TV and Internet ratings is

$\bar{D} - t_{\alpha/2} \dfrac{S_D}{\sqrt{n}} = 5.1500 - 2.0930 \dfrac{3.0826}{\sqrt{20}} = 3.7073$

$\bar{D} + t_{\alpha/2} \dfrac{S_D}{\sqrt{n}} = 5.1500 + 2.0930 \dfrac{3.0826}{\sqrt{20}} = 6.5927$

$3.7073 < \mu_D < 6.5927$
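The paired t test statistic and the confidence interval for 10.21 both rest on the standard error of the mean difference. As a rough check, the Python sketch below recomputes them from the summary values in the Paired t Test output; the function name is illustrative, and the critical value 2.0930 is copied from that output rather than computed.

```python
import math


def paired_t_summary(n, d_bar, s_d, t_critical, mu_d0=0.0):
    """Paired t statistic and confidence interval from summary statistics."""
    std_error = s_d / math.sqrt(n)
    t_stat = (d_bar - mu_d0) / std_error
    half_width = t_critical * std_error
    return std_error, t_stat, (d_bar - half_width, d_bar + half_width)


# n, DBar, SD, and the two-tail critical value (df = 19) from the 10.21 output.
std_error, t_stat, (lower, upper) = paired_t_summary(20, 5.1500, 3.0826, 2.0930)
print(round(std_error, 4), round(t_stat, 4), round(lower, 4), round(upper, 4))
```

Since the interval lies entirely above 0, it leads to the same decision as the two-tail test.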

10.22

From PHStat Paired t Test Data Hypothesized Mean Difference Level of significance

0 0.05

Intermediate Calculations Sample Size

25

DBar

3.0988

Degrees of Freedom

24

SD

4.6887

Standard Error

0.9377

t Test Statistic

3.3045

Two-Tail Test Lower Critical Value

-2.0639

Upper Critical Value

2.0639

p-Value

0.0030

Reject the null hypothesis

Because tSTAT = 3.3045 > 2.0639 or p-value = 0.0030, reject H0. There is evidence to conclude that the mean meal cost is higher at an inexpensive restaurant than at McDonald’s. You must assume that the distribution of the differences between the meal costs is approximately normal.

$\bar{D} - t_{\alpha/2} \dfrac{S_D}{\sqrt{n}} = 3.0988 - 2.0639 \dfrac{4.6887}{\sqrt{25}} = 1.1634$

$\bar{D} + t_{\alpha/2} \dfrac{S_D}{\sqrt{n}} = 3.0988 + 2.0639 \dfrac{4.6887}{\sqrt{25}} = 5.0342$

The confidence interval is from 1.1634 to 5.0342.

10.23

From PHStat Paired t Test

Data Hypothesized Mean Difference Level of significance

0 0.05

Intermediate Calculations Sample Size

10

DBar

-0.1000

Degrees of Freedom

9

SD

1.7288

Standard Error

0.5467

t Test Statistic

-0.1829

Two-Tail Test Lower Critical Value

-2.2622

Upper Critical Value

2.2622

p-Value

0.8589

Do not reject the null hypothesis

Because tSTAT = –0.1829 or p-value = 0.8589, do not reject H0. There is insufficient evidence to conclude that coffeepot-brewed coffee has a higher mean score than K-cup-brewed coffee.

10.24

(a)

Define the difference in bone marrow microvessel density as the density before the transplant minus the density after the transplant and assume that the difference in density is normally distributed.

$H_0: \mu_D \le 0$ vs. $H_1: \mu_D > 0$

t-Test: Paired Two Sample for Means

                                  Before       After
Mean                              312.1429     226
Variance                          15513.14     4971
Observations                      7            7
Pearson Correlation               0.295069
Hypothesized Mean Difference      0
df                                6
t Stat                            1.842455
P(T<=t) one-tail                  0.057493
t Critical one-tail               1.943181
P(T<=t) two-tail                  0.114986
t Critical two-tail               2.446914

Test statistic: $t_{STAT} = \dfrac{\bar{D} - \mu_D}{S_D / \sqrt{n}} = 1.8425$

10.24 cont.

(a)

(b)

(c)

(d)

Decision: Since tSTAT = 1.8425 is less than the critical value of 1.943, do not reject H0. There is not enough evidence to conclude that the mean bone marrow microvessel density is higher before the stem cell transplant than after the stem cell transplant.

p-value = 0.0575. The probability of obtaining a mean difference in density that gives rise to a t test statistic of 1.8425 or more is 0.0575 if the mean density is not higher before the stem cell transplant than after the stem cell transplant.

$\bar{D} \pm t \dfrac{S_D}{\sqrt{n}} = 86.1429 \pm 2.4469 \dfrac{123.7005}{\sqrt{7}}$

$-28.26 < \mu_D < 200.55$

You are 95% confident that the mean difference in bone marrow microvessel density before and after the stem cell transplant is somewhere between –28.26 and 200.55.

You must assume that the distribution of the differences between the densities before and after the stem cell transplant is approximately normal.

10.25

                          Cola A Adindex    Cola B (Test Cola) Adindex
Mean                      18.55263158       21.31578947
Standard Error            0.978937044       0.822086011
Median                    18                21
Mode                      24                21
Standard Deviation        6.034573222       5.067678519
Sample Variance           36.41607397       25.68136558
Kurtosis                  -0.640865482      -0.294923931
Skewness                  -0.077015645      -0.173917096
Range                     24                21
Minimum                   6                 9
Maximum                   30                30
Sum                       705               810
Count                     38                38
First Quartile            15                18
Third Quartile            24                24
Interquartile Range       9                 6
1.33 * Std Dev            8.025982385       6.740012431
5 * Std Dev               30.17286611       25.3383926

From the descriptive statistics provided in the Microsoft Excel output, there does not seem to be any violation of the assumption of normality. The mean and median are similar and the skewness value is near 0. Without observing other graphical devices such as a stem-and-leaf display, boxplot, or normal probability plot, the fact that the sample size (n = 38) is not very small enables us to assume that the paired t test is appropriate here.

10.25 cont.

At the 0.05 level, there is sufficient evidence of a difference between the mean Adindex values for Cola A and Cola B (Test Cola). The tSTAT is –2.57 with a p-value of 0.014. Because tSTAT = –2.57 or p-value = 0.014 < 0.05, reject H0. These findings suggest that the likeability of the Cola A video ad differs from that of Cola B.

10.26

(a)

10.26 cont.

(a)

(b) (c)

10.27

(a)

(b)

10.28

(a)

(b)

$H_0: \mu_D \ge 0$ $H_1: \mu_D < 0$

Decision rule: d.f. = 39. If tSTAT < –2.4258, reject H0.

Test statistic: $t_{STAT} = \dfrac{\bar{D} - \mu_D}{S_D / \sqrt{n}} = -9.372$

Decision: Since tSTAT = –9.372 is less than the critical bound of –2.4258, reject H0. There is enough evidence to conclude that the mean strength is lower at two days than at seven days.

You must assume that the distribution of the differences between the strengths of the concrete at two days and at seven days is approximately normal.

p-value is virtually 0. The probability of obtaining a mean difference that gives rise to a test statistic that is –9.372 or less when the null hypothesis is true is virtually 0.

$p_1 = \dfrac{X_1}{n_1} = \dfrac{40}{100} = 0.40$, $p_2 = \dfrac{X_2}{n_2} = \dfrac{25}{100} = 0.25$, and $\bar{p} = \dfrac{X_1 + X_2}{n_1 + n_2} = \dfrac{40 + 25}{100 + 100} = 0.325$

$H_0: \pi_1 = \pi_2$ $H_1: \pi_1 \ne \pi_2$

Decision rule: If ZSTAT < –1.96 or ZSTAT > 1.96, reject H0.

Test statistic: $Z_{STAT} = \dfrac{(p_1 - p_2) - (\pi_1 - \pi_2)}{\sqrt{\bar{p}(1 - \bar{p})\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}} = \dfrac{(0.40 - 0.25) - 0}{\sqrt{0.325(1 - 0.325)\left(\dfrac{1}{100} + \dfrac{1}{100}\right)}} = 2.2646$

Decision: Since ZSTAT = 2.2646 is above the critical bound of 1.96, reject H0. There is sufficient evidence to conclude that the population proportions differ for group 1 and group 2.

$(p_1 - p_2) \pm Z \sqrt{\dfrac{p_1(1 - p_1)}{n_1} + \dfrac{p_2(1 - p_2)}{n_2}} = 0.15 \pm 1.96 \sqrt{\dfrac{0.4(0.6)}{100} + \dfrac{0.25(0.75)}{100}}$

$0.0218 < \pi_1 - \pi_2 < 0.2782$

$p_1 = \dfrac{X_1}{n_1} = \dfrac{45}{100} = 0.45$, $p_2 = \dfrac{X_2}{n_2} = \dfrac{25}{50} = 0.50$, and $\bar{p} = \dfrac{X_1 + X_2}{n_1 + n_2} = \dfrac{45 + 25}{100 + 50} = 0.467$

$H_0: \pi_1 = \pi_2$ $H_1: \pi_1 \ne \pi_2$

Decision rule: If ZSTAT < –2.58 or ZSTAT > 2.58, reject H0.

Test statistic: $Z_{STAT} = \dfrac{(p_1 - p_2) - (\pi_1 - \pi_2)}{\sqrt{\bar{p}(1 - \bar{p})\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}} = \dfrac{(0.45 - 0.50) - 0}{\sqrt{0.467(1 - 0.467)\left(\dfrac{1}{100} + \dfrac{1}{50}\right)}} = -0.58$

Decision: Since ZSTAT = –0.58 is between the critical bounds of –2.58 and 2.58, do not reject H0. There is insufficient evidence to conclude that the population proportions differ for group 1 and group 2.

$(p_1 - p_2) \pm Z \sqrt{\dfrac{p_1(1 - p_1)}{n_1} + \dfrac{p_2(1 - p_2)}{n_2}} = -0.05 \pm 2.5758 \sqrt{\dfrac{0.45(0.55)}{100} + \dfrac{0.5(0.5)}{50}}$

$-0.2727 < \pi_1 - \pi_2 < 0.1727$

10.29

(a)

From PHStat Z Test for Differences in Two Proportions

Data Hypothesized Difference

0

Level of Significance

0.05

Group 1 Number of Items of Interest

883

Sample Size

2741

Group 2 Number of Items of Interest

873

Sample Size

2776

Intermediate Calculations Group 1 Proportion

0.322145202

Group 2 Proportion

0.314481268

Difference in Two Proportions

0.007663934

Average Proportion

0.3183

Z Test Statistic

0.6110

Two-Tail Test Lower Critical Value

-1.9600

Upper Critical Value

1.9600

p-Value

0.5412

Do not reject the null hypothesis

$H_0: \pi_1 = \pi_2$ $H_1: \pi_1 \ne \pi_2$ where Populations: 1 = Basic subscribers, 2 = Premium subscribers

Decision rule: If ZSTAT < –1.960 or ZSTAT > 1.960, reject H0.

Test statistic: ZSTAT = 0.6110

Decision: Since ZSTAT = 0.6110 is between the two critical bounds, do not reject H0. There is insufficient evidence of a difference between basic and premium subscribers in the proportion who churn at the 0.05 level of significance.

p-value = 0.5412. The probability of obtaining a difference in two sample proportions of 0.007663934 or more in either direction when the null hypothesis is true is 0.5412.

10.29 cont.

(b)

From PHStat Z Test for Differences in Two Proportions

Data Hypothesized Difference Level of Significance

0 0.05

Group 1 Number of Items of Interest

16

Sample Size

50

Group 2 Number of Items of Interest

15

Sample Size

50

Intermediate Calculations Group 1 Proportion

0.32

Group 2 Proportion

0.3

Difference in Two Proportions

0.02

Average Proportion

0.3100

Z Test Statistic

0.2162

Two-Tail Test Lower Critical Value

-1.9600

Upper Critical Value

1.9600

p-Value

0.8288

Do not reject the null hypothesis

$H_0: \pi_1 = \pi_2$ $H_1: \pi_1 \ne \pi_2$ where Populations: 1 = Basic subscribers, 2 = Premium subscribers

Decision rule: If ZSTAT < –1.960 or ZSTAT > 1.960, reject H0.

Test statistic: ZSTAT = 0.2162

Decision: Since ZSTAT = 0.2162 is between the two critical bounds, do not reject H0. There is insufficient evidence of a difference between basic and premium subscribers in the proportion who churn at the 0.05 level of significance.

p-value = 0.8288. The probability of obtaining a difference in two sample proportions of 0.02 or more in either direction when the null hypothesis is true is 0.8288.

(c)

There is no difference in the results despite the smaller sample size in part (b).
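Parts (a) and (b) of 10.29 apply the same Z test to a large and a small sample. The Python sketch below reruns both from the counts in the PHStat output; it is an illustration rather than part of the original solution, the function name `two_proportion_z` is an assumption, and the two-tail p-value is taken from the standard normal distribution via the standard library's `statistics.NormalDist`.

```python
import math
from statistics import NormalDist


def two_proportion_z(x1, n1, x2, n2):
    """Z test for the difference between two proportions (pooled estimate)."""
    p1, p2 = x1 / n1, x2 / n2
    p_bar = (x1 + x2) / (n1 + n2)  # the "Average Proportion" row
    std_error = math.sqrt(p_bar * (1 - p_bar) * (1 / n1 + 1 / n2))
    z_stat = (p1 - p2) / std_error
    p_value = 2 * (1 - NormalDist().cdf(abs(z_stat)))  # two-tail
    return z_stat, p_value


z_a, p_a = two_proportion_z(883, 2741, 873, 2776)  # part (a)
z_b, p_b = two_proportion_z(16, 50, 15, 50)        # part (b)

for label, z, p in (("(a)", z_a, p_a), ("(b)", z_b, p_b)):
    decision = "reject H0" if p < 0.05 else "do not reject H0"
    print(label, round(z, 4), round(p, 4), decision)
```

Both p-values are far above 0.05, so the decision is the same for the two sample sizes.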

10.30

(a) (b)

$H_0: \pi_1 \le \pi_2$ $H_1: \pi_1 > \pi_2$

Population 1 = Caffeinated, Population 2 = Decaffeinated

From PHStat Z Test for Differences in Two Proportions

Data Hypothesized Difference

0

Level of Significance

0.05

Group 1 Number of Items of Interest

40

Sample Size

100

Group 2 Number of Items of Interest

10

Sample Size

100

Intermediate Calculations Group 1 Proportion

0.4

Group 2 Proportion

0.1

Difference in Two Proportions

0.3

Average Proportion

0.2500

Z Test Statistic

4.8990

Upper-Tail Test Upper Critical Value

1.6449

p-Value

0.0000

Reject the null hypothesis

Decision rule: If ZSTAT > 1.6449, reject H0.

Test statistic: $Z_{STAT} = \dfrac{(p_1 - p_2) - (\pi_1 - \pi_2)}{\sqrt{\bar{p}(1 - \bar{p})\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}} = 4.899$

p-value is essentially 0.

Decision: Since ZSTAT = 4.899 > 1.6449 or p-value = 0.0000 < 0.05, reject H0. There is evidence to conclude that those who had caffeinated coffee were more likely to do impulse buying.

(c)

Yes, the result in (b) makes it appropriate to claim that those who had caffeinated coffee were more likely to do impulse buying than those who did not have caffeinated coffee.

10.31

(a)

From PHStat Z Test for Differences in Two Proportions

Data Hypothesized Difference

0

Level of Significance

0.05

Group 1 Number of Items of Interest

464

Sample Size

1030

Group 2 Number of Items of Interest

350

Sample Size

1030

Intermediate Calculations Group 1 Proportion

0.450485437

Group 2 Proportion

0.339805825

Difference in Two Proportions

0.110679612

Average Proportion

0.3951

Z Test Statistic

5.1377

Two-Tail Test Lower Critical Value

-1.9600

Upper Critical Value

1.9600

p-Value

0.0000

Reject the null hypothesis

$H_0: \pi_1 = \pi_2$ $H_1: \pi_1 \ne \pi_2$ where Populations: 1 = U.S. workers, 2 = Canadian workers

Decision rule: If ZSTAT < –1.960 or ZSTAT > 1.960, reject H0.

Test statistic: ZSTAT = 5.1377

Decision: Since ZSTAT = 5.1377 > 1.960, reject H0. There is evidence of a difference between U.S. and Canadian workers in the proportion who indicate that their organization provides explicit training on empathy for all people managers at the 0.05 level of significance.

(b)

p-value = 0.0000. The probability of obtaining a difference in two sample proportions of 0.110679612 or more in either direction when the null hypothesis is true is 0.0000 to four decimal places.

10.32

(a)

From PHStat Z Test for Differences in Two Proportions

Data Hypothesized Difference

0

Level of Significance

0.01

Group 1 Number of Items of Interest

1112

Sample Size

1737

Group 2 Number of Items of Interest

302

Sample Size

642

Intermediate Calculations Group 1 Proportion

0.640184226

Group 2 Proportion

0.470404984

Difference in Two Proportions

0.169779241

Average Proportion

0.5944

Z Test Statistic

7.4862

Two-Tail Test Lower Critical Value

-2.5758

Upper Critical Value

2.5758

p-Value

0.0000

Reject the null hypothesis

$H_0: \pi_1 = \pi_2$ $H_1: \pi_1 \ne \pi_2$ where Populations: 1 = HR Professionals, 2 = U.S. workers

Decision rule: If ZSTAT < –2.58 or ZSTAT > 2.58, reject H0.

Test statistic: ZSTAT = 7.4862

Decision: Since ZSTAT = 7.4862 > 2.58, reject H0. There is evidence of a difference between the proportions for HR professionals and U.S. workers at the 0.01 level of significance.

(b)

p-value = 0.0000. The probability of obtaining a difference in proportions that gives a test statistic below –7.4862 or above 7.4862 is 0.0000 to four decimal places if there is no difference between the proportions for the two groups.

10.32 cont.

(c)

From PHStat Confidence Interval Estimate of the Difference Between Two Proportions

Data Confidence Level

99%

Intermediate Calculations Z Value

-2.5758

Std. Error of the Diff. between two Proportions

0.0228

Interval Half Width

0.0588

Confidence Interval Interval Lower Limit

0.1110

Interval Upper Limit

0.2286

$0.1110 < \pi_1 - \pi_2 < 0.2286$

You are 99% confident that the difference between the two population proportions is between 11.10% and 22.86%.
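Unlike the Z test, the confidence interval for 10.32 uses the unpooled standard error, with each sample proportion contributing its own variance term. The Python sketch below retraces the PHStat rows under that formula; the function name is illustrative, and the Z value is obtained from the standard library's `statistics.NormalDist` rather than copied from the output.

```python
import math
from statistics import NormalDist


def two_proportion_ci(x1, n1, x2, n2, confidence):
    """Confidence interval for pi1 - pi2 using the unpooled standard error."""
    p1, p2 = x1 / n1, x2 / n2
    z_value = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    std_error = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    half_width = z_value * std_error
    diff = p1 - p2
    return std_error, half_width, (diff - half_width, diff + half_width)


# Counts from the 10.32 output: 1 = HR Professionals, 2 = U.S. workers.
std_error, half_width, (lower, upper) = two_proportion_ci(1112, 1737, 302, 642, 0.99)
print(round(std_error, 4), round(half_width, 4), round(lower, 4), round(upper, 4))
```

The interval excludes 0, which agrees with rejecting H0 at the 0.01 level in part (a).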

10.33

(a)

From PHStat, LinkedIn Z Test for Differences in Two Proportions

Data Hypothesized Difference

0

Level of Significance

0.05

Group 1 Number of Items of Interest

1246

Sample Size

1538

Group 2 Number of Items of Interest

1514

Sample Size

2857

Intermediate Calculations Group 1 Proportion

0.810143043

Group 2 Proportion

0.529926496

Difference in Two Proportions

0.280216547

Average Proportion

0.6280

Z Test Statistic

18.3313

Two-Tail Test Lower Critical Value

-1.9600

Upper Critical Value

1.9600

p-Value

0.0000

Reject the null hypothesis

$H_0: \pi_1 = \pi_2$ $H_1: \pi_1 \ne \pi_2$ where Populations: 1 = B2B marketers, 2 = B2C marketers

Decision rule: If ZSTAT < –1.96 or ZSTAT > 1.96, reject H0.

Test statistic: ZSTAT = 18.3313

Decision: Since ZSTAT = 18.3313 > 1.96, reject H0. There is evidence of a difference in the proportion of B2B and B2C marketers who use LinkedIn as a social media tool at the 0.05 level of significance.

(b)

p-value = 0.0000. The probability of obtaining a difference in proportions that gives a test statistic below –18.3313 or above 18.3313 is 0.0000 to four decimal places if there is no difference between the proportions for the two groups.

10.33 cont.

(c)

From PHStat, YouTube Z Test for Differences in Two Proportions

Data Hypothesized Difference

0

Level of Significance

0.05

Group 1 Number of Items of Interest

877

Sample Size

1539

Group 2 Number of Items of Interest

1571

Sample Size

2856

Intermediate Calculations Group 1 Proportion

0.569850552

Group 2 Proportion

0.550070028

Difference in Two Proportions

0.019780524

Average Proportion

0.5570

Z Test Statistic

1.2593

Two-Tail Test Lower Critical Value

-1.9600

Upper Critical Value

1.9600

p-Value

0.2079

Do not reject the null hypothesis

$H_0: \pi_1 = \pi_2$ $H_1: \pi_1 \ne \pi_2$ where Populations: 1 = B2B marketers, 2 = B2C marketers

Decision rule: If ZSTAT < –1.96 or ZSTAT > 1.96, reject H0.

Test statistic: ZSTAT = 1.2593

Decision: Since ZSTAT = 1.2593 is between the critical bounds of –1.96 and 1.96, do not reject H0. There is insufficient evidence of a difference in the proportion of B2B and B2C marketers who use YouTube as a social media tool at the 0.05 level of significance.

(d)

p-value = 0.2079. The probability of obtaining a difference in proportions that gives a test statistic below –1.2593 or above 1.2593 is 0.2079 if there is no difference between the proportions for the two groups.

10.34

(a)

$H_0: \pi_1 = \pi_2$ $H_1: \pi_1 \ne \pi_2$ where Populations: 1 = business leaders, 2 = knowledge workers

Z Test for Differences in Two Proportions

Data Hypothesized Difference

0

Level of Significance

0.05

Group 1 Number of Items of Interest

213

Sample Size

251

Group 2 Number of Items of Interest

161

Sample Size

251

Intermediate Calculations Group 1 Proportion

0.848605578

Group 2 Proportion

0.641434263

Difference in Two Proportions

0.207171315

Average Proportion

0.7450

Z Test Statistic

5.3249

Two-Tail Test Lower Critical Value

-1.9600

Upper Critical Value

1.9600

p-Value

0.0000

Reject the null hypothesis

Decision rule: If ZSTAT < –1.96 or ZSTAT > 1.96, reject H0.

Test statistic: ZSTAT = 5.3249

Decision: Since ZSTAT = 5.3249 > 1.96, reject H0. There is evidence, at the 0.05 level of significance, of a difference between the proportions of business leaders and knowledge workers who indicate that their company exemplifies effective communication.

(b)

p-value = 0.0000. The probability of obtaining a difference in proportions that is 0.2071 or more in either direction is 0.0000 to four decimal places if there is no difference between the proportions of business leaders and knowledge workers who indicate that their company exemplifies effective communication.

10.35

(a)

H0: π1 = π2
H1: π1 ≠ π2
where Populations: 1 = Northeast region, 2 = Midwest region

Z Test for Differences in Two Proportions

Data Hypothesized Difference

0

Level of Significance

0.05

Group 1 Number of Items of Interest

63

Sample Size

196

Group 2 Number of Items of Interest

44

Sample Size

208

Intermediate Calculations Group 1 Proportion

0.321428571

Group 2 Proportion

0.211538462

Difference in Two Proportions

0.10989011

Average Proportion

0.2649

Z Test Statistic

2.5017

Two-Tail Test Lower Critical Value

-1.9600

Upper Critical Value

1.9600

p-Value

0.0124

Reject the null hypothesis


Decision rule: If ZSTAT < –1.96 or ZSTAT > 1.96, reject H0.
ZSTAT = 2.5017
Decision: Since ZSTAT = 2.5017 > 1.96, reject H0. There is evidence of a difference in the proportion of Northeast and Midwest regions that indicate a preference for open-air shopping at the 0.05 level of significance.

(b)

p-value = 0.0124. The probability of obtaining a difference in proportions that is 0.10989 or more in either direction is 0.0124 if there is no difference between the proportion of Northeast and Midwest regions that indicate a preference for open-air shopping.
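The Z test for the difference between two proportions used in 10.33 through 10.35 can be sketched in a few lines. This is a minimal illustration, not the PHStat implementation: the function name `two_proportion_z_test` is hypothetical, and the standard normal distribution comes from Python's standard library. With the 10.35 counts it should reproduce, up to rounding, the ZSTAT and p-value reported above.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(x1, n1, x2, n2):
    """Two-tail Z test of H0: pi1 = pi2 using the pooled (average) proportion."""
    p1 = x1 / n1                       # Group 1 sample proportion
    p2 = x2 / n2                       # Group 2 sample proportion
    p_bar = (x1 + x2) / (n1 + n2)      # pooled estimate of the common proportion
    std_error = sqrt(p_bar * (1 - p_bar) * (1 / n1 + 1 / n2))
    z_stat = (p1 - p2) / std_error
    # Two-tail p-value: area beyond |Z| in both tails of the standard normal.
    p_value = 2 * (1 - NormalDist().cdf(abs(z_stat)))
    return z_stat, p_value


# Problem 10.35: 63 of 196 (Northeast) versus 44 of 208 (Midwest).
z_stat, p_value = two_proportion_z_test(63, 196, 44, 208)
print(f"ZSTAT = {z_stat:.4f}, p-value = {p_value:.4f}")
print("Reject H0" if abs(z_stat) > 1.96 else "Do not reject H0")
```

The same call with the 10.34 counts (213 of 251 and 161 of 251) gives the much larger test statistic reported for that problem.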


(c)

From PHStat Confidence Interval Estimate of the Difference Between Two Proportions

Data Confidence Level

95%

Intermediate Calculations Z Value

-1.9600

Std. Error of the Diff. between two Proportions

0.0438

Interval Half Width

0.0858

Confidence Interval Interval Lower Limit

0.0241

Interval Upper Limit

0.1957
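The interval limits in this table can be checked with a short sketch. It assumes the usual unpooled standard error for a confidence interval on the difference between two proportions; the function name `two_proportion_ci` is hypothetical and is not part of PHStat.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_ci(x1, n1, x2, n2, confidence=0.95):
    """Confidence interval for pi1 - pi2 using the unpooled standard error."""
    p1 = x1 / n1
    p2 = x2 / n2
    std_error = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    # Critical value that leaves (1 - confidence) / 2 in each tail.
    z_crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    half_width = z_crit * std_error
    diff = p1 - p2
    return diff - half_width, diff + half_width


# Problem 10.35(c): Northeast (63 of 196) minus Midwest (44 of 208).
lower, upper = two_proportion_ci(63, 196, 44, 208)
print(f"{lower:.4f} <= pi1 - pi2 <= {upper:.4f}")
```

Because the interval does not contain 0, it agrees with the decision to reject H0 in part (a).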

You are 95% confident that the difference between the Northeast and Midwest proportions that indicate a preference for open-air shopping is between 2.41% and 19.57%.

10.36

(a) 2.20

(b) 2.57

(c) 3.50

10.37

(a) α = 0.05, n1 = 16, n2 = 20, F0.05/2 = 2.62

(b) α = 0.01, n1 = 16, n2 = 20, F0.01/2 = 3.59

10.38

(a) Population B: S² = 25

(b) 1.5625

10.39

$F_{STAT} = \dfrac{S_1^2}{S_2^2} = \dfrac{155.3}{127.8} = 1.2152$

10.40

df numerator = 24, df denominator = 24

10.41

α = 0.05, n1 = 25, n2 = 25, F0.05/2 = 2.27

10.42

Because FSTAT = 1.2152 < 2.27, do not reject H0.

10.43

The F test for the ratio of two variances is sensitive to departures from normality. The F test should not be used because both populations are skewed and do not meet the assumption of normality. The Levene test or a nonparametric test should be used in this situation.

10.44

(a) Because $F_{STAT} = \dfrac{S_1^2}{S_2^2} = \dfrac{45.6}{38.7} = 1.1783 < 3.67$, do not reject H0.

(b) Because FSTAT = 1.1783 < 2.945, do not reject H0.

10.45

(a)

From PHStat, Larger-variance sample: Traditional. Smaller: PrePaid
H0: σ1² = σ2². H1: σ1² ≠ σ2².

F Test for Differences in Two Variances

Data Level of Significance

0.05

Larger-Variance Sample Sample Size

7

Sample Variance

56.80952381

Smaller-Variance Sample Sample Size

12

Sample Variance

15.35606061

Intermediate Calculations F Test Statistic

3.6995

Population 1 Sample Degrees of Freedom

6

Population 2 Sample Degrees of Freedom

11

Two-Tail Test Upper Critical Value

3.8807

p-Value

0.0583

Do not reject the null hypothesis

Decision rule: If FSTAT > 3.8807, reject H0. At the 0.05 significance level, there is no evidence of a difference between the variances of the two types of cell providers. The FSTAT of 3.6995 is below the upper critical value of 3.8807. Because FSTAT = 3.6995 < 3.8807 or p-value = 0.0583 > 0.05, do not reject H0.

(b)

The p-value = 0.0583, which means that the probability of obtaining an equal or larger value than the observed test statistic, when H0 is true, is 5.83%.

(c)

To justify the use of the F test, it is assumed that the rating data from both groups are normally distributed.

(d)

Because the results from (a) and (b) revealed no significant difference between the variances of the two types of cellular providers, one would use the pooled-variance t test.
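The logic of 10.45 (compute FSTAT with the larger sample variance in the numerator, compare it with the upper critical value, then choose the t test) can be sketched as follows. The function name `f_test_statistic` is hypothetical, and the critical value is simply copied from the PHStat output above rather than computed, since the Python standard library has no F distribution.

```python
def f_test_statistic(var_a, n_a, var_b, n_b):
    """Return (FSTAT, numerator df, denominator df) with the larger variance on top."""
    if var_a >= var_b:
        return var_a / var_b, n_a - 1, n_b - 1
    return var_b / var_a, n_b - 1, n_a - 1


# Problem 10.45: Traditional (n = 7) versus PrePaid (n = 12) sample variances.
f_stat, df_num, df_den = f_test_statistic(56.80952381, 7, 15.35606061, 12)

# Upper critical value for alpha = 0.05 (two-tail), taken from the PHStat output.
upper_critical = 3.8807

reject_h0 = f_stat > upper_critical
print(f"FSTAT = {f_stat:.4f} with {df_num} and {df_den} degrees of freedom")
print("Reject H0: use the separate-variance t test" if reject_h0
      else "Do not reject H0: use the pooled-variance t test")
```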

10.46

(a) At the 0.05 level, there is no evidence of a difference between the variances of the Southeast and Gulf Coast regions. One would fail to reject the null hypothesis. The FSTAT of 1.44 is below the upper critical value. Because FSTAT = 1.44 or p-value = 0.410, do not reject H0.

(b) The p-value is 0.410, which means that the probability of obtaining an equal or larger value than the observed test statistic, when H0 is true, is 41%.

(c) To justify the use of the F test, it was assumed that the rating data from both populations were normally distributed.

(d) Because the results from (a) and (b) revealed that there was no significant difference between the variances of the Southeast and Gulf Coast regions, one would use the pooled-variance t test.

10.47

(a)

At the 0.05 level, there is no evidence of a difference between the variances in waiting times at the two bank branches. One would fail to reject the null hypothesis. The FSTAT of 1.62 is below the upper critical value. Because FSTAT = 1.62 or p-value = 0.380, do not reject H0.

(b) The p-value is 0.380, which means that the probability of obtaining an equal or larger value than the observed test statistic, when H0 is true, is 38.0%. The p-value is above 0.05, which indicates that the observed difference is not significant.

(c) To justify the use of the F test, it was assumed that the waiting times from both groups were normally distributed. Waiting times for Bank1 appear to be skewed to the left while times for Bank2 are slightly skewed to the right. Because the F test for the ratio of two variances is sensitive to the normality assumption, other tests to assess for differences between two variances should be considered.

(d) Because the results from (a) revealed no significant difference between the variances of the two bank branches, one would use the pooled-variance t test.

10.48

From PHStat, Larger-variance sample: Halftime or after. Smaller: Before halftime
H0: σ1² = σ2². H1: σ1² ≠ σ2².

F Test for Differences in Two Variances

Data Level of Significance

0.05

Larger-Variance Sample Sample Size

29

Sample Variance

0.503054187

Smaller-Variance Sample Sample Size

28

Sample Variance

0.492843915

Intermediate Calculations F Test Statistic

1.0207

Population 1 Sample Degrees of Freedom

28

Population 2 Sample Degrees of Freedom

27

Two-Tail Test Upper Critical Value

2.1512

p-Value

0.9594

Do not reject the null hypothesis

(a) At the 0.05 level, there is insufficient evidence of a difference in the variability of the rating scores between ads in the first half and ads in the second half. The FSTAT of 1.0207 is below the upper critical value. Because FSTAT = 1.0207 < 2.1512 or p-value = 0.9594 > 0.05, do not reject H0.

(b) The p-value is 0.9594, which means that the probability of obtaining an equal or larger value than the observed test statistic, 1.0207, is 95.94% if there is no difference in the two population variances.


(c) To justify the use of the F test, it was assumed that the rating data from both groups were normally distributed.

(d) Based on (a) and (b), a pooled-variance t test should be used.


10.49

From PHStat
H0: σ1² = σ2². H1: σ1² ≠ σ2².

(a) At the 0.05 level, there is evidence of a difference between the variances in the time spent accessing the Internet between iOS users and Android users. The FSTAT of 12.4665 is well above the critical value. Because FSTAT = 12.4665 > 2.1010 or p-value = 0.0000 < 0.05, reject H0.

(b) On the basis of the results in (a), the separate-variance t test would be the appropriate choice for these data.

10.50

(a) Because FSTAT = 69.50001 > 1.9811 or p-value = 0.0000 < 0.05, reject H0. There is evidence of a difference in the variance of the delay times between the two drivers.

(b) You assume that the delay times are normally distributed.

(c) From the boxplots and the normal probability plots, the delay times appear to be approximately normally distributed.

(d) Because there is a difference in the variance of the delay times between the two drivers, you should use the separate-variance t test to determine whether there is evidence of a difference in the mean delay time between the two drivers.

10.51

Among the criteria to be used in selecting a particular hypothesis test are the type of data, whether the samples are independent or paired, whether the test involves central tendency or variation, whether the assumption of normality is valid, and whether the variances in the two populations are equal.

10.52

The pooled-variance t test should be used when the populations are approximately normally distributed and the variances of the two populations are assumed equal.


10.53

The F test can be used to examine differences in two variances when each of the two populations is assumed to be normally distributed.

10.54

With independent populations, the outcomes in one population do not depend on the outcomes in the second population. With two related populations, either repeated measurements are obtained on the same set of items or individuals, or items or individuals are paired or matched according to some characteristic.

10.55

Repeated measurements represent two measurements on the same items or individuals, while paired measurements involve matching items according to a characteristic of interest.

10.56

The hypothesis test for the difference between two means provides a single test statistic upon which a decision is made to reject or fail to reject the hypothesis. The confidence interval estimate provides the low end and high end of the mean differences assuming a given confidence level such as 95%. The confidence interval estimate can also be used to decide whether to accept or reject the null hypothesis. If the hypothesized value of 0 for the difference in two population means is not in the confidence interval, then, assuming a two-tailed test is used, the null hypothesis of no difference in the two population means can be rejected.

10.57

When you have paired data or data obtained from repeated measurements.

10.58

(a)

From PHStat, Black Belt variance = (30,000)² = 900,000,000
Green Belt variance = (25,000)² = 625,000,000
H0: σ1² = σ2². H1: σ1² ≠ σ2².

F Test for Differences in Two Variances

Data Level of Significance

0.05

Larger-Variance Sample Sample Size

47

Sample Variance

900000000

Smaller-Variance Sample Sample Size

56

Sample Variance

625000000

Intermediate Calculations F Test Statistic

1.4400

Population 1 Sample Degrees of Freedom

46

Population 2 Sample Degrees of Freedom

55

Two-Tail Test Upper Critical Value

1.7387

p-Value

0.1950

Do not reject the null hypothesis

Because FSTAT = 1.44 < 1.7387 or p-value = 0.1950 > 0.05, do not reject H0. There is insufficient evidence of a difference in the variance of the salary of Black Belts and Green Belts.

(b) Based on the results from (a), one would not reject the null hypothesis and would choose the pooled-variance t test.

(c) From PHStat

Pooled-Variance t Test for the Difference Between Two Means (assumes equal population variances)

Data Hypothesized Difference

0

Level of Significance

0.05

Population 1 Sample Sample Size

47

Sample Mean

126551

Sample Standard Deviation

30000

Population 2 Sample Sample Size

56

Sample Mean

95261

Sample Standard Deviation

25000

Intermediate Calculations Population 1 Sample Degrees of Freedom

46

Population 2 Sample Degrees of Freedom

55

Total Degrees of Freedom

101

Pooled Variance

750247524.7525

Standard Error

5418.4860

Difference in Sample Means

31290.0000

t Test Statistic

5.7747

Upper-Tail Test Upper Critical Value

1.6601

p-Value

0.0000 Reject the null hypothesis

At the 0.05 level, there is evidence that the mean salary for Black Belt jobs is significantly higher than the mean salary for Green Belt jobs. The tSTAT of 5.7747 is above the critical value. The p-value is 0.000. Because tSTAT = 5.7747 > 1.6601 or p-value = 0.000 < 0.05, reject H0.
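The intermediate calculations in the pooled-variance table above follow directly from the summary statistics. The sketch below assumes only the sample sizes, means, and standard deviations shown for 10.58(c); the function name `pooled_variance_t` is hypothetical and is not a PHStat routine.

```python
from math import sqrt


def pooled_variance_t(mean1, sd1, n1, mean2, sd2, n2, hypothesized_diff=0.0):
    """Pooled-variance t statistic for H0: mu1 - mu2 = hypothesized_diff."""
    df = (n1 - 1) + (n2 - 1)
    # Pooled variance: degrees-of-freedom weighted average of the two sample variances.
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    std_error = sqrt(pooled_var * (1 / n1 + 1 / n2))
    t_stat = ((mean1 - mean2) - hypothesized_diff) / std_error
    return pooled_var, std_error, t_stat, df


# Problem 10.58(c): Black Belt (population 1) versus Green Belt (population 2) salaries.
pooled_var, std_error, t_stat, df = pooled_variance_t(126551, 30000, 47, 95261, 25000, 56)
print(f"Pooled variance = {pooled_var:.4f}")
print(f"Standard error  = {std_error:.4f}")
print(f"tSTAT = {t_stat:.4f} with {df} degrees of freedom")
```

Comparing the resulting tSTAT with the upper-tail critical value 1.6601 from the table leads to the same decision to reject H0.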

10.59

(a)

From PHStat, Private and Public
H0: σ1² = σ2². H1: σ1² ≠ σ2².

F Test for Differences in Two Variances

Data Level of Significance

0.05

Larger-Variance Sample Sample Size

226

Sample Variance

51703673.46

Smaller-Variance Sample Sample Size

266

Sample Variance

5310845.239

Intermediate Calculations F Test Statistic

9.7355

Population 1 Sample Degrees of Freedom

225

Population 2 Sample Degrees of Freedom

265

Two-Tail Test Upper Critical Value

1.2847

p-Value

0.0000 Reject the null hypothesis

At the 0.05 level, there is sufficient evidence of a difference between the variances of the debt amounts of private and public colleges. The FSTAT of 9.7355 is above the upper critical value. Because FSTAT = 9.7355 > 1.2847 or p-value = 0.0000 < 0.05, reject H0.

(b) Based on the results from (a), one would choose the separate-variance t test because the variances of the two groups differed significantly at the 0.05 level.

(c)

From PHStat, Population 1 = Private Population 2 = Public Separate-Variances t Test for the Difference Between Two Means (assumes unequal population variances) Data Hypothesized Difference

0

Level of Significance

0.05

Population 1 Sample Sample Size

266

Sample Mean

8400.590226

Sample Standard Deviation

2304.5271

Population 2 Sample Sample Size

226

Sample Mean

7587.349558

Sample Standard Deviation

7190.5266

Intermediate Calculations Numerator of Degrees of Freedom

61873030209.5447

Denominator of Degrees of Freedom

234122289.7854

Total Degrees of Freedom

264.2765

Degrees of Freedom

264

Standard Error

498.7413

Difference in Sample Means

813.2407

Separate-Variance t Test Statistic

1.6306

Two-Tail Test Lower Critical Value

-1.9690

Upper Critical Value

1.9690

p-Value

0.1042

Do not reject the null hypothesis

(d) The tSTAT of 1.6306 is between the critical values. Because tSTAT = 1.6306 < 1.969 or p-value = 0.1042 > 0.05, do not reject H0. The separate-variance t test does not reveal a significant difference between private and public colleges in the mean amount of student loan debt incurred.

10.60

(a)

From PHStat, Boys variance = (45)² = 2025
Girls variance = (40)² = 1600
H0: σ1² = σ2². H1: σ1² ≠ σ2².

F Test for Differences in Two Variances

Data Level of Significance

0.01

Larger-Variance Sample Sample Size

100

Sample Variance

2025

Smaller-Variance Sample Sample Size

100

Sample Variance

1600

Intermediate Calculations F Test Statistic

1.2656

Population 1 Sample Degrees of Freedom

99

Population 2 Sample Degrees of Freedom

99

Two-Tail Test Upper Critical Value

1.6854

p-Value

0.2430

Do not reject the null hypothesis

Using a 0.01 level of significance, there is insufficient evidence of a difference in the variances in time spent online between boys and girls. Because FSTAT = 1.2656 < 1.6854 or p-value = 0.2430 > 0.01, do not reject H0.


(b)

It is most appropriate to use the pooled-variance t test to test for differences in mean online time between boys and girls. Pooled-Variance t Test for the Difference Between Two Means (assumes equal population variances) Data Hypothesized Difference

0

Level of Significance

0.01

Population 1 Sample Sample Size

100

Sample Mean

556

Sample Standard Deviation

45

Population 2 Sample Sample Size

100

Sample Mean

482

Sample Standard Deviation

40

Intermediate Calculations Population 1 Sample Degrees of Freedom

99

Population 2 Sample Degrees of Freedom

99

Total Degrees of Freedom

198

Pooled Variance

1812.5000

Standard Error

6.0208

Difference in Sample Means

74.0000

t Test Statistic

12.2907

Two-Tail Test Lower Critical Value

-2.6009

Upper Critical Value

2.6009

p-Value

0.0000 Reject the null hypothesis

At the 0.01 significance level, the tSTAT of 12.2907 is well above the upper critical value. Because tSTAT = 12.2907 > 2.6009 or p-value = 0.0000 < 0.01, one would reject the null hypothesis that there is no difference in the mean entertainment screen use time per day between boys and girls. There is evidence of a difference in the mean entertainment screen use time per day between boys and girls.

10.61

Because the F test for the ratio of two variances revealed a significant difference between city and outlying restaurants on the cost variable, a separate-variance t test was used for this variable. Because no significant differences were observed at the 0.05 level for the food, décor, and service ratings, a pooled-variance t test will be used for these variables. From PHStat, Cost, Location: Center City, Metro Area Separate-Variances t Test for the Difference Between Two Means (assumes unequal population variances) Data Hypothesized Difference

0

Level of Significance

0.05

Population 1 Sample Sample Size

50

Sample Mean

60.68

Sample Standard Deviation

23.5973

Population 2 Sample Sample Size

50

Sample Mean

46.18

Sample Standard Deviation

14.7284

Intermediate Calculations Numerator of Degrees of Freedom

239.4821

Denominator of Degrees of Freedom

2.9153

Total Degrees of Freedom

82.1473

Degrees of Freedom

82

Standard Error

3.9339

Difference in Sample Means

14.5000

Separate-Variance t Test Statistic

3.6860


Two-Tail Test Lower Critical Value

-1.9893

Upper Critical Value

1.9893

p-Value

0.0004 Reject the null hypothesis

The mean cost is significantly higher for Center City restaurants than for Metro Area restaurants. Because tSTAT = 3.6860 > 1.9893 or p-value = 0.0004 < 0.05, one would reject the null hypothesis that the mean cost is the same for the Center City and Metro Area restaurants.
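The separate-variance t test for the cost variable, including the fractional degrees of freedom shown in the table, can be reproduced from the summary statistics. This is a sketch under the assumption that the table uses the standard Satterthwaite approximation for the degrees of freedom, which the reported numerator and denominator are consistent with; the function name `separate_variance_t` is hypothetical.

```python
from math import sqrt


def separate_variance_t(mean1, sd1, n1, mean2, sd2, n2):
    """Separate-variance t statistic and approximate degrees of freedom."""
    v1 = sd1 ** 2 / n1                 # variance of the first sample mean
    v2 = sd2 ** 2 / n2                 # variance of the second sample mean
    std_error = sqrt(v1 + v2)
    t_stat = (mean1 - mean2) / std_error
    # Satterthwaite approximation; the table reports it rounded down to an integer.
    df_numerator = (v1 + v2) ** 2
    df_denominator = v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1)
    return std_error, t_stat, df_numerator / df_denominator


# Problem 10.61, cost: Center City (population 1) versus Metro Area (population 2).
std_error, t_stat, df = separate_variance_t(60.68, 23.5973, 50, 46.18, 14.7284, 50)
print(f"Standard error = {std_error:.4f}")
print(f"tSTAT = {t_stat:.4f}, total degrees of freedom = {df:.4f} (use {int(df)})")
```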


From PHStat, Food, Location: Center City, Metro Area Pooled-Variance t Test for the Difference Between Two Means (assumes equal population variances) Data Hypothesized Difference

0

Level of Significance

0.05

Population 1 Sample Sample Size

50

Sample Mean

22.88

Sample Standard Deviation

2.412890552

Population 2 Sample Sample Size

50

Sample Mean

24.32

Sample Standard Deviation

2.132738012

Intermediate Calculations Population 1 Sample Degrees of Freedom

49

Population 2 Sample Degrees of Freedom

49

Total Degrees of Freedom

98

Pooled Variance

5.1853

Standard Error

0.4554

Difference in Sample Means

-1.4400

t Test Statistic

-3.1619

Two-Tail Test Lower Critical Value

-1.9845

Upper Critical Value

1.9845

p-Value

0.0021 Reject the null hypothesis

The mean food rating is significantly higher for Metro Area restaurants compared to Center City restaurants. Because tSTAT = –3.1619 < –1.9845 or p-value = 0.0021 < 0.05, one would reject the null hypothesis that the mean food ratings were the same for the Center City and Metro Area restaurants.


From PHStat, Decor, Location: Center City, Metro Area Pooled-Variance t Test for the Difference Between Two Means (assumes equal population variances) Data Hypothesized Difference

0

Level of Significance

0.05

Population 1 Sample Sample Size

50

Sample Mean

19.02

Sample Standard Deviation

4.515234982

Population 2 Sample Sample Size

50

Sample Mean

18.76

Sample Standard Deviation

3.825585189

Intermediate Calculations Population 1 Sample Degrees of Freedom

49

Population 2 Sample Degrees of Freedom

49

Total Degrees of Freedom

98

Pooled Variance

17.5112

Standard Error

0.8369

Difference in Sample Means

0.2600

t Test Statistic

0.3107

Two-Tail Test Lower Critical Value

-1.9845

Upper Critical Value

1.9845

p-Value

0.7567

Do not reject the null hypothesis There is no significant difference in mean decor rating for Center City restaurants compared to Metro Area restaurants. Because tSTAT = 0.3107 or p-value = 0.7567, one would not reject the null hypothesis that the mean decor ratings were the same for the Center City and Metro Area restaurants.


From PHStat, Service, Location: Center City, Metro Area Pooled-Variance t Test for the Difference Between Two Means (assumes equal population variances) Data Hypothesized Difference

0

Level of Significance

0.05

Population 1 Sample Sample Size

50

Sample Mean

20.62

Sample Standard Deviation

3.421957907

Population 2 Sample Sample Size

50

Sample Mean

21.34

Sample Standard Deviation

2.599921506

Intermediate Calculations Population 1 Sample Degrees of Freedom

49

Population 2 Sample Degrees of Freedom

49

Total Degrees of Freedom

98

Pooled Variance

9.2347

Standard Error

0.6078

Difference in Sample Means

-0.7200

t Test Statistic

-1.1847

Two-Tail Test Lower Critical Value

-1.9845

Upper Critical Value

1.9845

p-Value

0.2390

Do not reject the null hypothesis There is no significant difference in mean service rating for Center City restaurants compared to Metro Area restaurants. Because tSTAT = –1.1847 or p-value = 0.2390, one would not reject the null hypothesis that the mean service ratings were the same for the Center City and Metro Area restaurants.

10.62

(a)

H0: μ ≤ 10 minutes. Introductory computer students required no more than a mean of 10 minutes to write and run a program in Python.
H1: μ > 10 minutes. Introductory computer students required more than a mean of 10 minutes to write and run a program in Python.
Decision rule: d.f. = 8. If tSTAT > 1.8595, reject H0.
Test statistic: $t_{STAT} = \dfrac{\bar{X} - \mu}{S/\sqrt{n}} = \dfrac{12 - 10}{1.8028/\sqrt{9}} = 3.3282$
Decision: Since tSTAT = 3.3282 is greater than the critical bound of 1.8595, reject H0. There is enough evidence to conclude that the introductory computer students required more than a mean of 10 minutes to write and run a program in Python.
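The one-sample test statistic above is easy to check numerically. The sketch below uses a hypothetical helper, `one_sample_t`, and takes the upper-tail critical value 1.8595 (d.f. = 8) from the decision rule rather than computing it.

```python
from math import sqrt


def one_sample_t(sample_mean, hypothesized_mean, sample_sd, n):
    """t statistic for a one-sample test of the mean."""
    return (sample_mean - hypothesized_mean) / (sample_sd / sqrt(n))


upper_critical = 1.8595  # upper-tail critical value for d.f. = 8, from the decision rule

# Summary statistics for the nine introductory students.
t_stat = one_sample_t(12, 10, 1.8028, 9)
print(f"tSTAT = {t_stat:.4f}:", "reject H0" if t_stat > upper_critical else "do not reject H0")
```

Calling the same helper with a mean of 16 and a standard deviation of 13.2004 shows how a single extreme value inflates the denominator and pulls the statistic below the critical value.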

(b)

(c)

(d)

H0: μ ≤ 10 minutes. Introductory computer students required no more than a mean of 10 minutes to write and run a program in Python.
H1: μ > 10 minutes. Introductory computer students required more than a mean of 10 minutes to write and run a program in Python.
Decision rule: d.f. = 8. If tSTAT > 1.8595, reject H0.
Test statistic: $t_{STAT} = \dfrac{\bar{X} - \mu}{S/\sqrt{n}} = \dfrac{16 - 10}{13.2004/\sqrt{9}} = 1.3636$
Decision: Since tSTAT = 1.3636 is less than the critical bound of 1.8595, do not reject H0. There is not enough evidence to conclude that the introductory computer students required more than a mean of 10 minutes to write and run a program in Python.

Although the mean time necessary to complete the assignment increased from 12 to 16 minutes as a result of the increase in one data value, the standard deviation went from 1.8 to 13.2, which in turn brought the t-value down because of the increased denominator.

H0: σIC² = σCS²
H1: σIC² ≠ σCS²
Decision rule: If FSTAT > 3.8549, reject H0.
Test statistic: $F_{STAT} = \dfrac{S_{CS}^2}{S_{IC}^2} = \dfrac{2.0^2}{1.8028^2} = 1.2307$
Decision: Since FSTAT = 1.2307 is lower than the critical bound 3.8549, do not reject H0. There is not enough evidence to conclude that the population variances are different for the Introduction to Computers students and computer majors. Hence, the pooled-variance t test is a valid test to see whether computer majors can write a Python program (on average) in less time than introductory students, assuming that the distributions of the time needed to write a Python program for both the Introduction to Computers students and the computer majors are approximately normal.

H0: μIC ≤ μCS The mean amount of time needed by Introduction to Computers students is not greater than the mean amount of time needed by computer majors.
H1: μIC > μCS The mean amount of time needed by Introduction to Computers students is greater than the mean amount of time needed by computer majors.

Pooled-Variance t Test for the Difference Between Two Means (assumes equal population variances)

Data Hypothesized Difference

0

Level of Significance

0.05

Population 1 Sample Sample Size

9

Sample Mean

12

Sample Standard Deviation

1.802776

Population 2 Sample Sample Size

11

Sample Mean

8.5

Sample Standard Deviation

2

Intermediate Calculations Population 1 Sample Degrees of Freedom

8

Population 2 Sample Degrees of Freedom

10

Total Degrees of Freedom

18

Pooled-Variance

3.666667

Difference in Sample Means

3.5

t Test Statistic

4.066633

Upper-Tail Test Upper Critical Value

1.734064

p-Value

0.000362 Reject the null hypothesis

Decision rule: d.f. = 18. If tSTAT > 1.7341, reject H0.
Test statistic:

$S_p^2 = \dfrac{(n_{IC} - 1) S_{IC}^2 + (n_{CS} - 1) S_{CS}^2}{(n_{IC} - 1) + (n_{CS} - 1)} = \dfrac{(9 - 1)(1.8028^2) + (11 - 1)(2.0^2)}{8 + 10} = 3.6667$

$t_{STAT} = \dfrac{(\bar{X}_{IC} - \bar{X}_{CS}) - (\mu_{IC} - \mu_{CS})}{\sqrt{S_p^2 \left( \dfrac{1}{n_{IC}} + \dfrac{1}{n_{CS}} \right)}} = \dfrac{12.0 - 8.5}{\sqrt{3.6667 \left( \dfrac{1}{9} + \dfrac{1}{11} \right)}} = 4.0666$

Decision: Since tSTAT = 4.0666 is greater than 1.7341, reject H0. There is enough evidence to support a conclusion that the mean time is higher for Introduction to Computers students than for computer majors.

p-value = 0.0052. If the true population mean amount of time needed for Introduction to Computer students to write a Python program is indeed no more than 10 minutes, the probability for observing a sample mean greater than the 12 minutes in the current sample is 0.0052, which means it will be a quite unlikely event. Hence, at a 95% level of confidence, you can conclude that the population mean amount of time needed for Introduction to Computer students to write a Python program is more than 10 minutes.

(e) As illustrated in part (d), in which there is not enough evidence to conclude that the population variances are different for the Introduction to Computers students and computer majors, the pooled-variance t test performed is a valid test to determine whether computer majors can write a Python program in less time than introductory students, assuming that the distributions of the time needed to write a Python program for both the Introduction to Computers students and the computer majors are approximately normal.

10.63

(a)

F Test for Differences in Two Variances

Data Level of Significance

0.05

Larger-Variance Sample Sample Size

232

Sample Variance

1600

Smaller-Variance Sample Sample Size

257

Sample Variance

400

Intermediate Calculations F Test Statistic

4.0000

Population 1 Sample Degrees of Freedom

231

Population 2 Sample Degrees of Freedom

256

Two-Tail Test Upper Critical Value

1.2857

p-Value

0.0000 Reject the null hypothesis

An F test for the ratio of two variances revealed a significant difference between the variances of the consumers with elementary children and the consumers with middle school children. This difference was significant at the 0.05 significance level.


(b)

Population 1 = Middle School, Population 2 = Elementary School

Separate-Variances t Test for the Difference Between Two Means (assumes unequal population variances)

Data Hypothesized Difference

0

Level of Significance

0.05

Population 1 Sample Sample Size

232

Sample Mean

345.9

Sample Standard Deviation

40.0000

Population 2 Sample Sample Size

257

Sample Mean

318.4

Sample Standard Deviation

20.0000

Intermediate Calculations Numerator of Degrees of Freedom

71.4527

Denominator of Degrees of Freedom

0.2154

Total Degrees of Freedom

331.7818

Degrees of Freedom

331

Standard Error

2.9074

Difference in Sample Means

27.5000

Separate-Variance t Test Statistic

9.4586

Two-Tail Test Lower Critical Value

-1.9672

Upper Critical Value

1.9672

p-Value

0.0000 Reject the null hypothesis

A separate-variance t test revealed a significant difference at the 0.05 level between the mean amount spent by consumers with elementary children and the consumers with middle school children. Consumers with middle school children spent significantly more than consumers with elementary school children. Because tSTAT = 9.4586 > 1.9672 or p-value = 0.000 < 0.05, one would reject the null hypothesis that there is no difference in the mean amount spent between consumers with elementary children and the consumers with middle school children.

(c) The confidence interval for the mean difference in amount spent is 21.9604 ≤ μ1 – μ2 ≤ 33.0396.

10.64

Because an F test for the ratio of two variances revealed no significant differences between the variances of the two manufacturers, a pooled-variance t test is appropriate for these data.


The mean length of life was significantly longer for Manufacturer 2 compared to Manufacturer 1. Because tSTAT = –5.08 or p-value = 0.000, one would reject the null hypothesis that the mean length of bulb life is the same for the two manufacturers.

10.65

Population 1 = Wing A, 2 = Wing B
H0: $\sigma_1^2 = \sigma_2^2$ The population variances are the same.
H1: $\sigma_1^2 \ne \sigma_2^2$ The population variances are different.
Decision rule: If FSTAT > 2.5265, reject H0.
Test statistic:
$$F_{STAT} = \frac{S_1^2}{S_2^2} = \frac{1.4172^2}{1.3700^2} = 1.0701$$
Decision: Since FSTAT = 1.0701 is lower than the critical bound of $F_{\alpha/2}$ = 2.5265, do not reject H0. There is not enough evidence to conclude that there is a difference between the variances in Wing A and Wing B. Hence, a pooled-variance t test is more appropriate for determining whether there is a difference in the mean delivery time in the two wings of the hotel.

H0: $\mu_1 = \mu_2$
H1: $\mu_1 \ne \mu_2$
Decision rule: d.f. = 38. If tSTAT < –2.0244 or tSTAT > 2.0244, reject H0.
Test statistic:
$$S_p^2 = \frac{(n_1 - 1)S_1^2 + (n_2 - 1)S_2^2}{(n_1 - 1) + (n_2 - 1)} = \frac{(20 - 1)(1.3700^2) + (20 - 1)(1.4172^2)}{(20 - 1) + (20 - 1)} = 1.9427$$
$$t_{STAT} = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{S_p^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}} = \frac{(10.40 - 8.12) - 0}{\sqrt{1.9427\left(\frac{1}{20} + \frac{1}{20}\right)}} = 5.1615$$
Decision: Since tSTAT = 5.1615 is greater than the upper critical bound of 2.0244, reject H0. There is enough evidence of a difference in the mean delivery time in the two wings of the hotel.

10.66

H0: $\pi_1 = \pi_2$ H1: $\pi_1 \ne \pi_2$ where Populations: 1 = Males, 2 = Females
Decision rule: If p-value < 0.05, reject H0.

Gender: Z Test for Differences in Two Proportions

Data
Hypothesized Difference            0
Level of Significance              0.05
Group 1
Number of Items of Interest        50
Sample Size                        300
Group 2
Number of Items of Interest        96
Sample Size                        330

Intermediate Calculations
Group 1 Proportion                 0.166666667
Group 2 Proportion                 0.290909091
Difference in Two Proportions      -0.12424242
Average Proportion                 0.2317
Z Test Statistic                   -3.6911

Two-Tail Test
Lower Critical Value               -1.9600
Upper Critical Value               1.9600
p-Value                            0.0002
Reject the null hypothesis

Decision: Since the p-value is smaller than 0.05, reject H0. There is enough evidence of a difference between males and females in the proportion who order dessert.
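The Z test for the difference between two proportions in this problem can be checked by hand from the counts. The following is a hedged Python sketch using only the standard library; the function name `two_proportion_z` is an illustrative assumption, and the two-tail p-value is computed from the standard normal distribution via `math.erfc`.

```python
import math

def two_proportion_z(x1, n1, x2, n2):
    """Z test for the difference between two proportions (pooled estimate).

    Returns (Z statistic, two-tail p-value).
    """
    p1, p2 = x1 / n1, x2 / n2
    p_bar = (x1 + x2) / (n1 + n2)   # "Average Proportion" in the output
    std_error = math.sqrt(p_bar * (1 - p_bar) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / std_error
    p_value = math.erfc(abs(z) / math.sqrt(2))   # equals 2 * (1 - Phi(|z|))
    return z, p_value

# Gender: 50 of 300 males and 96 of 330 females ordered dessert
z_gender, p_gender = two_proportion_z(50, 300, 96, 330)
# Beef entrée: 74 of 197 versus 68 of 433 ordered dessert
z_beef, p_beef = two_proportion_z(74, 197, 68, 433)
print(round(z_gender, 4), round(z_beef, 4))
```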

Beef Entrée: Z Test for Differences in Two Proportions

Data
Hypothesized Difference            0
Level of Significance              0.05
Group 1
Number of Items of Interest        74
Sample Size                        197
Group 2
Number of Items of Interest        68
Sample Size                        433

Intermediate Calculations
Group 1 Proportion                 0.375634518
Group 2 Proportion                 0.15704388
Difference in Two Proportions      0.218590638
Average Proportion                 0.2254
Z Test Statistic                   6.0873

Two-Tail Test
Lower Critical Value               -1.9600
Upper Critical Value               1.9600
p-Value                            0.0000
Reject the null hypothesis

Decision: Since the p-value = 0.0000 is smaller than 0.05, reject H0. There is enough evidence of a difference in the proportion who order dessert based on whether a beef entrée has been ordered.

10.67

[Figure: Normal Probability Plot, Vermont (horizontal axis: Z Value)]

[Figure: Normal Probability Plot, Boston (horizontal axis: Z Value)]

Because the normal probability plots suggest that the two populations are not normally distributed, an F test is inappropriate for testing the difference in two variances. The sample variances for Boston and Vermont shingles are 1204.992 and 2185.032, respectively. It appears that a separate-variance t test is more appropriate for testing the difference in means.

H0: $\mu_B = \mu_V$ Mean weights of Boston and Vermont shingles are the same.
H1: $\mu_B \ne \mu_V$ Mean weights of Boston and Vermont shingles are different.

Since the p-value is essentially zero, reject H0. There is sufficient evidence to conclude that the mean weights of Boston and Vermont shingles are different.

10.68

The normal probability plots suggest that the two populations are not normally distributed. An F test is inappropriate for testing the difference in the two variances. The sample variances for Boston and Vermont shingles are 0.0203 and 0.015, respectively. Because tSTAT = 3.015 > 1.967 or p-value = 0.0028 < $\alpha$ = 0.05, reject H0. There is sufficient evidence to conclude that there is a difference in the mean granule loss of Boston and Vermont shingles.

10.69

Because an F test for the ratio of two variances revealed an insignificant difference at the 0.05 level between the variances of the two types of smartphone batteries, a pooled-variance t test is appropriate for these data.

Pooled-Variance t Test for the Difference Between Two Means
(assumes equal population variances)

Data
Hypothesized Difference                   0
Level of Significance                     0.05

Population 1 Sample
Sample Size                               30
Sample Mean                               30
Sample Standard Deviation                 43.2

Population 2 Sample
Sample Size                               30
Sample Mean                               35
Sample Standard Deviation                 34.2

Intermediate Calculations
Population 1 Sample Degrees of Freedom    29
Population 2 Sample Degrees of Freedom    29
Total Degrees of Freedom                  58
Pooled Variance                           1517.9400
Standard Error                            10.0596
Difference in Sample Means                -5.0000
t Test Statistic                          -0.4970

Two-Tail Test
Lower Critical Value                      -2.0017
Upper Critical Value                      2.0017
p-Value                                   0.6210
Do not reject the null hypothesis

A pooled-variance t test revealed no significant difference in mean charging time between the two types of iPhones. Because tSTAT = –0.4970 or p-value = 0.6210, one would not reject H0. There is insufficient evidence of a difference in time to charge to 50% capacity between the iPhone 14 Pro and iPhone 14 Pro Max.
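The pooled variance, standard error, and t statistic reported for 10.69 can be recomputed from the sample sizes, means, and standard deviations. This is a minimal sketch under the equal-variance assumption; the function name `pooled_variance_t` is an illustrative choice rather than anything defined in the text.

```python
import math

def pooled_variance_t(n1, mean1, sd1, n2, mean2, sd2, hypothesized_diff=0.0):
    """Pooled-variance t test from summary statistics.

    Returns (t statistic, degrees of freedom, pooled variance, standard error).
    """
    df = (n1 - 1) + (n2 - 1)
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    std_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    t_stat = ((mean1 - mean2) - hypothesized_diff) / std_error
    return t_stat, df, pooled_var, std_error

# Values taken from the 10.69 output
t_stat, df, pooled_var, std_error = pooled_variance_t(30, 30, 43.2, 30, 35, 34.2)
print(round(t_stat, 4), df, round(pooled_var, 4), round(std_error, 4))
```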

10.70

An analysis of the data in 10.67 revealed that a pallet of Vermont shingles weighed significantly more than a pallet of Boston shingles. Assuming that a heavier shingle is associated with higher quality, the Vermont shingle might be perceived as a higher quality shingle compared to the Boston shingle. The figure below shows the results from a separate-variance t test.

An analysis of the data in 10.68 revealed that the Vermont shingles were associated with less granule loss compared to the Boston shingles following accelerated-life testing. Shingles with less weight loss are assumed to have a longer life expectancy. Both shingles would be expected to outperform the length of the warranty period because their weight losses were well below the 0.8 gram threshold. However, the Vermont shingles would be expected to have a longer life expectancy given that they lost less weight relative to the Boston shingles. The figure below shows the results from a pooled-variance t test.

Taken together, the results from 10.67 and 10.68 suggest that the Vermont shingle may be a higher quality shingle based on pallet weight and life expectancy as determined by granule loss associated with accelerated-life testing. These conclusions suggest that the manufacturer may be able to charge more for the Vermont shingle compared to the Boston shingle.



Chapter 11

11.1

(a) df A = c – 1 = 5 – 1 = 4
(b) df W = n – c = 30 – 5 = 25
(c) df T = n – 1 = 30 – 1 = 29

11.2

(a) SSW = SST – SSA = 210 – 60 = 150
(b) $MSA = \frac{SSA}{c - 1} = \frac{60}{5 - 1} = 15$
(c) $MSW = \frac{SSW}{n - c} = \frac{150}{30 - 5} = 6$
(d) $F_{STAT} = \frac{MSA}{MSW} = \frac{15}{6} = 2.5$

11.3

(a)
Source           df    SS     MS    F
Among groups     4     60     15    2.5
Within groups    25    150    6
Total            29    210

(b) F(4, 25) = 2.76
(c) Decision rule: If FSTAT > 2.76, reject H0.
(d) Decision: Since FSTAT = 2.5 is less than the critical bound of 2.76, do not reject H0.

11.4

(a) df A = c – 1 = 3 – 1 = 2
(b) df W = n – c = 15 – 3 = 12
(c) df T = n – 1 = 15 – 1 = 14
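The degrees of freedom, mean squares, and FSTAT in Problems 11.1 through 11.4 all follow from SSA, SST, the number of groups c, and the total sample size n. A short Python sketch of that bookkeeping is shown here; the function name and dictionary keys are illustrative assumptions.

```python
def one_way_anova_from_ss(ssa, sst, c, n):
    """Build a one-way ANOVA summary from SSA, SST, c groups, and n values."""
    ssw = sst - ssa                      # within-group sum of squares
    df_among, df_within, df_total = c - 1, n - c, n - 1
    msa = ssa / df_among                 # mean square among groups
    msw = ssw / df_within                # mean square within groups
    return {
        "SSW": ssw,
        "df": (df_among, df_within, df_total),
        "MSA": msa,
        "MSW": msw,
        "F": msa / msw,
    }

# Problem 11.2: SSA = 60, SST = 210, c = 5 groups, n = 30 values
table = one_way_anova_from_ss(ssa=60, sst=210, c=5, n=30)
print(table)
```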

11.5

Source           df             SS                 MS               F
Among groups     4 – 1 = 3      (80)(3) = 240      80               80/24 = 3.33
Within groups    24 – 4 = 20    480                480/20 = 24
Total            24 – 1 = 23    240 + 480 = 720

11.6

(a) Decision rule: If FSTAT > 3.10, reject H0.
(b) Since FSTAT = 3.33 is greater than the critical bound of 3.10, reject H0.
(c) There are c = 4 degrees of freedom in the numerator and n – c = 24 – 4 = 20 degrees of freedom in the denominator. For 4 degrees of freedom in the numerator and 20 degrees in the denominator, the critical value is $Q_\alpha = 3.96$.
(d) To perform the Tukey-Kramer procedure, the critical range is
$$Q_\alpha \sqrt{\frac{MSW}{2}\left(\frac{1}{n_j} + \frac{1}{n_{j'}}\right)} = 3.96\sqrt{\frac{24}{2}\left(\frac{1}{6} + \frac{1}{6}\right)} = 7.92$$
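The Tukey-Kramer critical range in 11.6 is the Studentized range critical value Q multiplied by the square root of (MSW/2)(1/n_j + 1/n_j'). A small Python sketch of that formula follows; the function name is an illustrative assumption, and the Q values are the table values quoted in the solutions, not computed here.

```python
import math

def tukey_kramer_critical_range(q, msw, n_j, n_j_prime):
    """Critical range for comparing two group means with sizes n_j and n_j'."""
    return q * math.sqrt((msw / 2) * (1 / n_j + 1 / n_j_prime))

# Problem 11.6: Q = 3.96, MSW = 24, six values in each group
range_11_6 = tukey_kramer_critical_range(3.96, 24, 6, 6)
# Problem 11.8: Q = 3.90, MSW = 4564.5428, eight values in each group
range_11_8 = tukey_kramer_critical_range(3.90, 4564.5428, 8, 8)
print(round(range_11_6, 2), round(range_11_8, 3))
```

Two group means are declared different when the absolute difference between them exceeds the critical range.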


11.7

(a) At the 0.05 level of significance, there is no evidence that there are differences among the four toasting times. Because FSTAT = 1.14 or p-value = 0.389, do not reject H0. Due to the small sample size within each group (3), caution should be taken when interpreting these results. Further analysis with larger sample sizes is recommended.

(b) Because the FSTAT statistic was not significant, it would not be appropriate to perform post-hoc comparisons to determine differences among the groups.

(c) Because the Levene test FSTAT = 0.13 or p-value = 0.939, there is no evidence that the variances are different among the 20 sec, 40 sec, 60 sec, and 80 sec groups.

(d) Although the FSTAT was not significant, the four toasting time groups had slightly different means. The means are 0.967, 1.367, 1.300, and 0.667 for the 20 sec, 40 sec, 60 sec, and 80 sec groups, respectively. There is no evidence that these differences could be attributed to toasting times.


11.8

(a) Null hypothesis, H0: All means are equal, $\mu_A = \mu_B = \mu_C = \mu_D$
Alternative hypothesis, H1: At least one mean is different.
c – 1 = 4 – 1 = 3, n = 4(8) = 32, n – c = 32 – 4 = 28
From PHStat

ANOVA: Single Factor

SUMMARY
Groups      Count    Sum        Average      Variance
Europe      8        1309       163.625      14513.9821
Americas    8        1236.77    154.59625    1986.5641
Asia        8        1050       131.25       901.6429
Africa      8        1115       139.375      855.9821

ANOVA
Source of Variation    SS             df    MS           F         P-value    F crit
Between Groups         5120.9418      3     1706.9806    0.3740    0.7724     2.9467
Within Groups          127807.1988    28    4564.5428
Total                  132928.1406    31

Level of significance  0.05

$MSA = \frac{SSA}{c - 1} = \frac{5{,}120.9418}{4 - 1} = 1{,}706.9806$
$MSW = \frac{SSW}{n - c} = \frac{127{,}807.1988}{32 - 4} = 4{,}564.5428$
$F_{STAT} = \frac{MSA}{MSW} = \frac{1{,}706.9806}{4{,}564.5428} = 0.3740$

Because the p-value is 0.7724 > 0.05 and FSTAT = 0.3740 < 2.95, do not reject H0. There is insufficient evidence of a difference in the mean export price across the four global regions.

(b) There are c = 4 degrees of freedom in the numerator and n – c = 32 – 4 = 28 degrees of freedom in the denominator. The table does not have 28 degrees of freedom in the denominator, so use the next larger critical value, $Q_\alpha = 3.90$.

To perform the Tukey-Kramer procedure, the critical range is
$$Q_\alpha \sqrt{\frac{MSW}{2}\left(\frac{1}{n_j} + \frac{1}{n_{j'}}\right)} = 3.90\sqrt{\frac{4{,}564.5428}{2}\left(\frac{1}{8} + \frac{1}{8}\right)} = 93.158$$

From the Tukey-Kramer procedure, there is no evidence of a difference in the mean export prices of the four regions.
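The one-way ANOVA quantities in part (a) of 11.8 can be recovered from the SUMMARY block alone (group counts, averages, and variances). The sketch below is a hedged illustration of that calculation; the function name and the tuple layout are assumptions made for this example.

```python
def one_way_anova_from_summary(groups):
    """One-way ANOVA from (count, mean, variance) triples, one per group.

    Returns (SSA, SSW, MSA, MSW, F).
    """
    n = sum(count for count, _, _ in groups)
    c = len(groups)
    grand_mean = sum(count * mean for count, mean, _ in groups) / n
    # Among-group variation: weighted squared deviations of group means
    ssa = sum(count * (mean - grand_mean) ** 2 for count, mean, _ in groups)
    # Within-group variation: (n_j - 1) times each sample variance
    ssw = sum((count - 1) * var for count, _, var in groups)
    msa = ssa / (c - 1)
    msw = ssw / (n - c)
    return ssa, ssw, msa, msw, msa / msw

# Europe, Americas, Asia, Africa from the 11.8 SUMMARY table
export_groups = [
    (8, 163.625, 14513.9821),
    (8, 154.59625, 1986.5641),
    (8, 131.25, 901.6429),
    (8, 139.375, 855.9821),
]
ssa, ssw, msa, msw, f_stat = one_way_anova_from_summary(export_groups)
print(round(ssa, 4), round(ssw, 4), round(f_stat, 4))
```

Because the variances in the table are rounded to four decimals, SSW agrees with the printed value only to about three decimals.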


(c) ANOVA output for Levene's test for homogeneity of variance:
From PHStat

ANOVA: Levene Test

SUMMARY
Groups      Count    Sum       Average     Variance
Europe      8        409       51.125      13603.2679
Americas    8        251.23    31.40375    925.4298
Asia        8        152       19          586.8571
Africa      8        163       20.375      435.5536

ANOVA
Source of Variation    SS             df    MS           F         P-value    F crit
Between Groups         5287.7656      3     1762.5885    0.4534    0.7170     2.9467
Within Groups          108857.7588    28    3887.7771
Total                  114145.5244    31

Level of significance  0.05

$MSA = \frac{SSA}{c - 1} = \frac{5{,}287.7656}{4 - 1} = 1{,}762.5885$
$MSW = \frac{SSW}{n - c} = \frac{108{,}857.7588}{32 - 4} = 3{,}887.7771$
$F_{STAT} = \frac{MSA}{MSW} = \frac{1{,}762.5885}{3{,}887.7771} = 0.4534$

Because the p-value is 0.7170 > 0.05 and FSTAT = 0.4534 < 2.9467, do not reject H0. There is insufficient evidence to conclude that the variances in the export prices across the four global regions are different.
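The Levene test output here is a one-way ANOVA applied to absolute deviations rather than to the raw values. The sketch below assumes the median-based version (absolute deviation of each value from its group median); that choice, the function name, and the small data sets are illustrative assumptions, since the raw export prices are not reproduced in this solution.

```python
from statistics import median

def levene_f(groups):
    """Levene-type F statistic: one-way ANOVA on |x - group median|."""
    abs_dev = [[abs(x - median(g)) for x in g] for g in groups]
    n = sum(len(g) for g in abs_dev)
    c = len(abs_dev)
    grand_mean = sum(sum(g) for g in abs_dev) / n
    means = [sum(g) / len(g) for g in abs_dev]
    ssa = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(abs_dev, means))
    ssw = sum((d - m) ** 2 for g, m in zip(abs_dev, means) for d in g)
    return (ssa / (c - 1)) / (ssw / (n - c))

# Hypothetical data: three groups with identical spread (F should be 0)
equal_spread = [[1, 2, 3, 4, 5], [11, 12, 13, 14, 15], [21, 22, 23, 24, 25]]
# Hypothetical data: the second group is ten times as spread out
unequal_spread = [[1, 2, 3, 4, 5], [10, 20, 30, 40, 50]]

f_equal = levene_f(equal_spread)
f_unequal = levene_f(unequal_spread)
print(f_equal, round(f_unequal, 4))
```

A small F statistic, as in the output above, indicates that the typical absolute deviation is similar across groups.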

(d) From the results in (a) and (c), there is no evidence of a difference in the means or the variances of the export prices in the different regions.

11.9

(a) H0: $\mu_1 = \mu_2 = \mu_3 = \mu_4$ where 1 = Main, 2 = Satellite 1, 3 = Satellite 2, 4 = Satellite 3
H1: Not all $\mu_j$ are equal where j = 1, 2, 3, 4

Decision Rule: If p-value < 0.05, reject H0. Since p-value = 0.0009 < 0.05, reject the null hypothesis. There is enough evidence to conclude that there is a significant difference in the mean waiting time in the four locations.

(b) From the Tukey Pairwise Comparison procedure, there is a difference in mean waiting time between the main campus and Satellite 1, and the main campus and Satellite 3.

(c) H0: $\sigma_1^2 = \sigma_2^2 = \sigma_3^2 = \sigma_4^2$ H1: At least one variance is different.

Source of Variation    SS          df    MS          F         P-value    F crit
Between Groups         310.979     3     103.6597    0.8201    0.4883     2.7694
Within Groups          7078.435    56    126.4006
Total                  7389.414    59

Since the p-value = 0.4883 > 0.05, do not reject H0. There is not enough evidence to conclude there is a significant difference in the variation in waiting time among the four locations.

11.10

(a) H0: $\mu_A = \mu_B = \mu_C = \mu_D = \mu_E$
H1: At least one mean is different.

Since the p-value is essentially zero, reject H0. There is evidence of a difference in the mean rating of the five advertisements.

(b) There is a difference in the mean rating between advertisements A and C, between A and D, between B and C, between B and D, and between D and E.

(c) H0: $\sigma_A^2 = \sigma_B^2 = \sigma_C^2 = \sigma_D^2 = \sigma_E^2$ H1: At least one variance is different.
ANOVA output for Levene's test for homogeneity of variance:

ANOVA
Source of Variation    SS          df    MS          F           P-value     F crit
Between Groups         14.13333    4     3.533333    1.927273    0.137107    2.758711
Within Groups          45.83333    25    1.833333
Total                  59.96667    29

Since the p-value = 0.137 > 0.05, do not reject H0. There is not enough evidence to conclude there is a difference in the variation in rating among the five advertisements.

(d) There is no significant difference between advertisements A and B, and they have the highest mean rating among the five and should be used. There is no significant difference between advertisements C and D, and they are among the lowest in mean rating and should be avoided.

11.11

(a) Null hypothesis, H0: All means are equal
Alternative hypothesis, H1: At least one mean is different.
c – 1 = 6 – 1 = 5, n = 50, n – c = 50 – 6 = 44
From PHStat

ANOVA: Single Factor

SUMMARY
Groups      Count    Sum      Average        Variance
Burger      14       27970    1997.857143    897397.3626
Snack       6        4713     785.5          165132.3000
Chicken     9        18012    2001.333333    1379336.5000
Global      6        9361     1560.166667    152194.9667
Sandwich    9        11928    1325.333333    658107.7500
Pizza       6        4905     817.5          45417.9000

ANOVA
Source of Variation    SS               df    MS              F         P-value    F crit
Between Groups         11814909.0324    5     2362981.8065    3.4914    0.0096     2.4270
Within Groups          29779445.5476    44    676805.5806
Total                  41594354.5800    49

Level of significance  0.05

$MSA = \frac{SSA}{c - 1} = \frac{11{,}814{,}909.0324}{6 - 1} = 2{,}362{,}981.8065$
$MSW = \frac{SSW}{n - c} = \frac{29{,}779{,}445.5476}{50 - 6} = 676{,}805.5806$
$F_{STAT} = \frac{MSA}{MSW} = \frac{2{,}362{,}981.8065}{676{,}805.5806} = 3.4914$

Because the p-value is 0.0096 < 0.05 and FSTAT = 3.4914 > 2.4270, reject H0. At the 0.05 level of significance, there is sufficient evidence that there are differences in U.S. mean sales per unit among the six food segments.


(b) c – 1 = 6 – 1 = 5, n = 50, n – c = 50 – 6 = 44
From PHStat

ANOVA: Levene Test

SUMMARY
Groups      Count    Sum     Average        Variance
Burger      14       9754    696.7142857    532090.1813
Snack       6        1669    278.1666667    74656.5667
Chicken     9        6945    771.6666667    800384.5000
Global      6        1667    277.8333333    60806.9667
Sandwich    9        5319    591            294086.7500
Pizza       6        923     153.8333333    18575.4667

ANOVA
Source of Variation    SS               df    MS             F         P-value    F crit
Between Groups         2558287.0629     5     511657.4126    1.3691    0.2543     2.4270
Within Groups          16443137.3571    44    373707.6672
Total                  19001424.4200    49

Level of significance  0.05

$MSA = \frac{SSA}{c - 1} = \frac{2{,}558{,}287.0629}{6 - 1} = 511{,}657.4126$
$MSW = \frac{SSW}{n - c} = \frac{16{,}443{,}137.3571}{50 - 6} = 373{,}707.6672$
$F_{STAT} = \frac{MSA}{MSW} = \frac{511{,}657.4126}{373{,}707.6672} = 1.3691$

Because the p-value is 0.2543 > 0.05 and FSTAT = 1.3691 < 2.4270, do not reject H0. There is insufficient evidence that the variances are different among the six food segments.

(c) Because the Levene test did not reveal significant differences in variation in U.S. average sales per unit, the results in (a) would be valid.


(d)

There are c = 6 degrees of freedom in the numerator and n – c = 44 degrees of freedom in the denominator. The table does not have 44 degrees of freedom in the denominator, so use the next larger critical value, $Q_\alpha = 4.23$.
From PHStat

Tukey-Kramer Multiple Comparisons

Group          Sample Mean    Sample Size
1: Burger      1997.857       14
2: Snack       785.5          6
3: Chicken     2001.333       9
4: Global      1560.167       6
5: Sandwich    1325.333       9
6: Pizza       817.5          6

Comparison            Absolute Difference    Std. Error of Difference    Critical Range    Results
Group 1 to Group 2    1212.357               283.8522378                 1201              Means are different
Group 1 to Group 3    3.47619                248.5396104                 1051              Means are not different
Group 1 to Group 4    437.6905               283.8522378                 1201              Means are not different
Group 1 to Group 5    672.5238               248.5396104                 1051              Means are not different
Group 1 to Group 6    1180.357               283.8522378                 1201              Means are not different
Group 2 to Group 3    1215.833               306.5954584                 1297              Means are not different
Group 2 to Group 4    774.6667               335.8584971                 1421              Means are not different
Group 2 to Group 5    539.8333               306.5954584                 1297              Means are not different
Group 2 to Group 6    32                     335.8584971                 1421              Means are not different
Group 3 to Group 4    441.1667               306.5954584                 1297              Means are not different
Group 3 to Group 5    676                    274.2273146                 1160              Means are not different
Group 3 to Group 6    1183.833               306.5954584                 1297              Means are not different
Group 4 to Group 5    234.8333               306.5954584                 1297              Means are not different
Group 4 to Group 6    742.6667               335.8584971                 1421              Means are not different
Group 5 to Group 6    507.8333               306.5954584                 1297              Means are not different

Other Data
Level of significance    0.05
Numerator d.f.           6
Denominator d.f.         44
MSW                      676805.6
Q Statistic              4.23
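With unequal sample sizes, each pair of groups has its own critical range. The comparisons listed above can be reproduced with the short Python sketch below, which takes the group means, sample sizes, MSW, and Q from the output; the variable and function names are illustrative assumptions.

```python
import math
from itertools import combinations

# Sample mean and sample size for each food segment (from the output above)
segments = {
    "Burger": (1997.857143, 14),
    "Snack": (785.5, 6),
    "Chicken": (2001.333333, 9),
    "Global": (1560.166667, 6),
    "Sandwich": (1325.333333, 9),
    "Pizza": (817.5, 6),
}
MSW = 676805.5806   # mean square within from the one-way ANOVA
Q = 4.23            # Studentized range critical value used in the solution

def tukey_kramer_pairs(groups, msw, q):
    """Return (group a, group b, |difference|, critical range, differs?) per pair."""
    results = []
    for (a, (mean_a, n_a)), (b, (mean_b, n_b)) in combinations(groups.items(), 2):
        diff = abs(mean_a - mean_b)
        crit = q * math.sqrt((msw / 2) * (1 / n_a + 1 / n_b))
        results.append((a, b, diff, crit, diff > crit))
    return results

results = tukey_kramer_pairs(segments, MSW, Q)
different = [(a, b) for a, b, _, _, differs in results if differs]
print(different)
```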

Although the ANOVA FSTAT value for the overall F test was significant at the 0.05 significance level, the Tukey-Kramer procedure revealed only one significant pairwise difference among the six food segments at the 0.05 significance level: between the burger and snack segments. The results should be interpreted with caution due to the relatively small sample sizes.

11.12

(a)
Source           Degrees of Freedom    Sum of Squares          Mean Squares          F
Among Groups     2                     62,160,064,576.58       31,080,032,288.29     1.2239
Within Groups    41                    1,041,172,839,201.60    25,394,459,492.722
Total            43                    1,103,332,903,778.18

(b) Because FSTAT = 1.2239 < 3.23, do not reject H0. There is insufficient evidence that there are significant differences in mean brand value among the financial institution, technology, and telecom sectors.

(c) Because the results in (b) indicated that there were no differences in mean brand value among the three sectors, it would not be appropriate to use the Tukey-Kramer procedure.


11.13

H0: $\mu_1 = \mu_2 = \mu_3 = \mu_4 = \mu_5$ H1: At least one mean is different.
Population: 1 = Kidney, 2 = Shrimp, 3 = Chicken Liver, 4 = Salmon, 5 = Beef
Decision rule: df: 4, 45. If p-value < 0.05, reject H0.

(a)
ANOVA
Source of Variation    SS         df    MS         F           P-value     F crit
Between Groups         3.65896    4     0.91474    20.80541    9.15E-10    2.578739
Within Groups          1.97849    45    0.04397
Total                  5.63745    49

Test statistic: FSTAT = 20.8054; the p-value is essentially 0.
Decision: Since p-value < 0.05, reject H0. There is evidence of a significant difference in the mean amount of food eaten among the various products. To determine which of the means are significantly different from one another, use the Tukey-Kramer procedure.

(b)
Tukey-Kramer Multiple Comparisons

Group    Sample Mean    Sample Size
1        2.456          10
2        2.409          10
3        2.368          10
4        2.028          10
5        1.754          10

Comparison            Absolute Difference    Std. Error of Difference    Critical Range    Results
Group 1 to Group 2    0.047                  0.0663072                   0.2513            Means are not different
Group 1 to Group 3    0.088                  0.0663072                   0.2513            Means are not different
Group 1 to Group 4    0.428                  0.0663072                   0.2513            Means are different
Group 1 to Group 5    0.702                  0.0663072                   0.2513            Means are different
Group 2 to Group 3    0.041                  0.0663072                   0.2513            Means are not different
Group 2 to Group 4    0.381                  0.0663072                   0.2513            Means are different
Group 2 to Group 5    0.655                  0.0663072                   0.2513            Means are different
Group 3 to Group 4    0.34                   0.0663072                   0.2513            Means are different
Group 3 to Group 5    0.614                  0.0663072                   0.2513            Means are different
Group 4 to Group 5    0.274                  0.0663072                   0.2513            Means are different

Other Data
Level of significance    0.05
Numerator d.f.           5
Denominator d.f.         45
MSW                      0.043966
Q Statistic              3.79

The kidney-based, shrimp-based, and chicken-liver-based products are not significantly different from one another, while the salmon-based and beef-based products are significantly different from the others and from each other.

(c) H0: $\sigma_1^2 = \sigma_2^2 = \sigma_3^2 = \sigma_4^2 = \sigma_5^2$ H1: At least one variance is different.
Excel output for Levene's test for homogeneity of variance:

ANOVA
Source of Variation    SS         df    MS          F           P-value     F crit
Between Groups         0.13412    4     0.03353     2.114035    0.094673    2.578739
Within Groups          0.71373    45    0.015861
Total                  0.84785    49

Since the p-value = 0.0947 > 0.05, do not reject H0. There is not enough evidence to conclude there is a significant difference in the variation in the amount of food eaten among the various products.

(d) The pet food company should conclude that the mean amounts of cat food eaten for the kidney-based, shrimp-based, and chicken-liver-based products are not significantly different from each other but are significantly higher than for salmon-based products, and that the mean amount eaten for salmon-based products is significantly higher than for beef-based products.

11.14

(a) Null hypothesis, H0: All means are equal, $\mu_1 = \mu_2 = \mu_3 = \mu_4$
Alternative hypothesis, H1: At least one mean is different.
c – 1 = 4 – 1 = 3, n = 4(10) = 40, n – c = 40 – 4 = 36
From PHStat

ANOVA: Single Factor

SUMMARY
Groups           Count    Sum    Average    Variance
Asia             10       345    34.5       39.3889
Europe           10       421    42.1       65.8778
North America    10       275    27.5       21.6111
South America    10       301    30.1       72.9889

ANOVA
Source of Variation    SS           df    MS          F         P-value    F crit
Between Groups         1225.1000    3     408.3667    8.1728    0.0003     2.8663
Within Groups          1798.8000    36    49.9667
Total                  3023.9000    39

Level of significance  0.05

$MSA = \frac{SSA}{c - 1} = \frac{1225.1000}{4 - 1} = 408.3667$
$MSW = \frac{SSW}{n - c} = \frac{1798.8000}{40 - 4} = 49.9667$
$F_{STAT} = \frac{MSA}{MSW} = \frac{408.3667}{49.9667} = 8.1728$

At the 0.05 level of significance, there is evidence that there are differences in congestion levels among the four continents. Because p-value = 0.0003 < 0.05 and FSTAT = 8.1728 > 2.8663, reject H0. The ANOVA FSTAT indicates that there are differences in congestion levels among the four continents. However, it does not indicate which groups differ from one another.

(b)

There are c = 4 df for the numerator and n – c = 36 df for the denominator. The table does not have 36 degrees of freedom in the denominator, so use the next larger critical value, $Q_\alpha = 3.84$.
From PHStat

Tukey-Kramer Multiple Comparisons

Group               Sample Mean    Sample Size
1: Asia             34.5           10
2: Europe           42.1           10
3: North America    27.5           10
4: South America    30.1           10

Comparison            Absolute Difference    Std. Error of Difference    Critical Range    Results
Group 1 to Group 2    7.6                    2.2353225                   8.584             Means are not different
Group 1 to Group 3    7                      2.2353225                   8.584             Means are not different
Group 1 to Group 4    4.4                    2.2353225                   8.584             Means are not different
Group 2 to Group 3    14.6                   2.2353225                   8.584             Means are different
Group 2 to Group 4    12                     2.2353225                   8.584             Means are different
Group 3 to Group 4    2.6                    2.2353225                   8.584             Means are not different

Other Data
Level of significance    0.05
Numerator d.f.           4
Denominator d.f.         36
MSW                      49.96667
Q Statistic              3.84

Because the results from (a) revealed a significant difference in congestion levels among the four continents, the Tukey-Kramer procedure is appropriate. Europe is different from North America and from South America. No other pairs of continents differ significantly.

(c) The ANOVA F test assumes randomness and independence of samples, normally distributed data in all groups, and equal variances among groups.

(d) Null hypothesis, H0: All variances are equal.
Alternative hypothesis, H1: At least one variance is different.
c – 1 = 4 – 1 = 3, n = 4(10) = 40, n – c = 40 – 4 = 36
From PHStat

ANOVA: Levene Test

SUMMARY
Groups           Count    Sum    Average    Variance
Asia             10       43     4.3        21.3444
Europe           10       67     6.7        23.5111
North America    10       37     3.7        7.5111
South America    10       65     6.5        26.0556

ANOVA
Source of Variation    SS          df    MS         F         P-value    F crit
Between Groups         69.6000     3     23.2000    1.1833    0.3297     2.8663
Within Groups          705.8000    36    19.6056
Total                  775.4000    39

Level of significance  0.05

Because the Levene test FSTAT = 1.1833 < 2.8663 or p-value = 0.3297 > 0.05, do not reject H0. There is insufficient evidence that the variances are different among the four regions.

11.15

df A = r – 1 = 3 – 1 = 2
df B = c – 1 = 3 – 1 = 2
df AB = (r – 1)(c – 1) = (3 – 1)(3 – 1) = 4
df E = rc(n′ – 1) = (3)(3)(3 – 1) = 18
df T = n – 1 = rcn′ – 1 = (3)(3)(3) – 1 = 26

(a) (b) (c) (d)

11.16

Source             df    SS                                  MS
Factor A           2     120                                 (b) 120 ÷ 2 = 60
Factor B           2     110                                 (b) 110 ÷ 2 = 55
Interaction, AB    4     (a) 540 – 120 – 110 – 270 = 40      (c) 40 ÷ 4 = 10
Error, E           18    270                                 (d) 270 ÷ 18 = 15
Total, T           26    540

(a) 540 – 120 – 110 – 270 = 40
(b) 120 ÷ 2 = 60 and 110 ÷ 2 = 55
(c) 40 ÷ 4 = 10
(d) 270 ÷ 18 = 15

11.17

(a) FSTAT = MSAB ÷ MSE = 10 ÷ 15 = 0.67
(b) FSTAT = MSA ÷ MSE = 60 ÷ 15 = 4
(c) FSTAT = MSB ÷ MSE = 55 ÷ 15 = 3.67
(d)
Source             df    SS                              MS               F
Factor A           2     120                             120 ÷ 2 = 60     (b) 60 ÷ 15 = 4
Factor B           2     110                             110 ÷ 2 = 55     (c) 55 ÷ 15 = 3.67
Interaction, AB    4     540 – 120 – 110 – 270 = 40      40 ÷ 4 = 10      (a) 10 ÷ 15 = 0.67
Error, E           18    270                             270 ÷ 18 = 15
Total, T           26    540

11.18

F(2, 18) = 3.55, F(4, 18) = 2.93
(a) Factor A Decision: Since FSTAT = 4.00 is greater than the critical bound of 3.55, reject H0. There is evidence of a difference among factor A means.
(b) Factor B Decision: Since FSTAT = 3.67 is greater than the critical bound of 3.55, reject H0. There is evidence of a difference among factor B means.
(c) Interaction, AB Decision: Since FSTAT = 0.67 is less than the critical bound of 2.93, do not reject H0. There is insufficient evidence to conclude there is an interaction effect.
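The two-factor ANOVA table used in Problems 11.15 through 11.18 can be assembled from SSA, SSB, SSE, SST, and the design sizes r, c, and n′. The Python sketch below illustrates that arithmetic; the function name and dictionary keys are assumptions made for this example.

```python
def two_way_anova_table(ssa, ssb, sse, sst, r, c, n_prime):
    """Two-factor ANOVA with replication: returns (SSAB, df, MS, F)."""
    ssab = sst - ssa - ssb - sse          # interaction sum of squares
    df = {
        "A": r - 1,
        "B": c - 1,
        "AB": (r - 1) * (c - 1),
        "E": r * c * (n_prime - 1),
        "T": r * c * n_prime - 1,
    }
    ss = {"A": ssa, "B": ssb, "AB": ssab, "E": sse}
    ms = {key: ss[key] / df[key] for key in ss}
    f = {key: ms[key] / ms["E"] for key in ("A", "B", "AB")}
    return ssab, df, ms, f

# r = 3, c = 3, n' = 3 with SSA = 120, SSB = 110, SSE = 270, SST = 540
ssab, df, ms, f = two_way_anova_table(120, 110, 270, 540, r=3, c=3, n_prime=3)
print(ssab, df, ms, {key: round(value, 2) for key, value in f.items()})
```

Each F statistic is then compared with the critical value for its own numerator degrees of freedom and the error degrees of freedom.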


11.19

(a) r = 3, c = 4, n′ = 12

Source             df                  SS                         MS                 F
Factor A           3 – 1 = 2           18                         18 ÷ 2 = 9         9 ÷ 0.45 = 19.8
Factor B           4 – 1 = 3           64                         64 ÷ 3 = 21.33     21.33 ÷ 0.45 = 46.93
Interaction, AB    2(3) = 6            150 – 18 – 64 – 60 = 8     8 ÷ 6 = 1.33       1.33 ÷ 0.45 = 2.93
Error, E           3(4)(11) = 132      60                         60 ÷ 132 = 0.45
Total, T           143                 150

The table does not have 132 degrees of freedom in the denominator, so use the next larger critical value, with 120 degrees of freedom.
F(2, 120) = 3.07, F(3, 120) = 2.68, F(6, 120) = 2.17

(b) Factor A Decision: Since FSTAT = 19.8 is greater than the critical bound of 3.07, reject H0. There is evidence of a difference among factor A means.
(c) Factor B Decision: Since FSTAT = 46.93 is greater than the critical bound of 2.68, reject H0. There is evidence of a difference among factor B means.
(d) Interaction, AB Decision: Since FSTAT = 2.93 is greater than the critical bound of 2.17, reject H0. There is evidence to conclude there is an interaction effect.

11.20

Source             df           SS                              MS               F
Factor A           2            2 × 80 = 160                    80               80 ÷ 5 = 16
Factor B           8 ÷ 2 = 4    220                             220 ÷ 4 = 55     11
Interaction, AB    8            8 × 10 = 80                     10               10 ÷ 5 = 2
Error, E           30           30 × 5 = 150                    55 ÷ 11 = 5
Total, T           44           160 + 220 + 80 + 150 = 610

11.21

F(2, 30) = 3.32, F(4, 30) = 2.69, F(8, 30) = 2.27
(a) Decision: Since FSTAT = 16 is greater than the critical bound of 3.32, reject H0. There is evidence of a difference among factor A means.
(b) Decision: Since FSTAT = 11 is greater than the critical bound of 2.69, reject H0. There is evidence of a difference among factor B means.
(c) Decision: Since FSTAT = 2 is less than the critical bound of 2.27, do not reject H0. There is insufficient evidence to conclude there is an interaction effect.

11.22

Two-way ANOVA output:

(a) H0: There is no interaction between die temperature and die diameter. H1: There is an interaction between die temperature and die diameter.
Decision: Since FSTAT = 3.4032 < 4.3512, do not reject H0. There is insufficient evidence to conclude that there is any interaction between die temperature and die diameter.

(b) H0: $\mu_{145..} = \mu_{155..}$ H1: $\mu_{145..} \ne \mu_{155..}$
Decision: Since FSTAT = 1.85 < 4.3512, do not reject H0. There is insufficient evidence to conclude that there is an effect due to die temperature.

(c) H0: $\mu_{.3\text{mm}.} = \mu_{.4\text{mm}.}$ H1: $\mu_{.3\text{mm}.} \ne \mu_{.4\text{mm}.}$
Decision: Since FSTAT = 9.45 > 4.3512, reject H0. There is sufficient evidence to conclude that there is an effect due to die diameter.

(d)

(e) At the 5% level of significance, die diameter has an effect on the density while die temperature does not have any impact on the density. There is no significant interaction between die diameter and die temperature.

11.23

Copyright ©2024 Pearson Education, Inc.


Solutions to End-of-Section and Chapter Review Problems

11.23 cont.

xxvii

Excel Two-way ANOVA output: ANOVA Source of Variation Sample Columns Interaction Within

SS 2.7812 1.8760 0.7526 43.0284

Total

48.4382

(a)

(b)

(c)

df 1 1 1 20

MS 2.7812 1.8760 0.7526 2.1514

F 1.2927 0.8720 0.3498

P-value 0.2690 0.3615 0.5608

F crit 4.3512 4.3512 4.3512

Level of significance

0.05

23

H0: There is no interaction between die temperature and die diameter. H1: There is an interaction between die temperature and die diameter. Decision: Since FSTAT = 0.35 < 4.3512, do not reject H0. There is insufficient evidence to conclude that there is any interaction between die temperature and die diameter. H0: 145..  155.. H1: 145  155 Decision: Since FSTAT = 1.29 < 4.3512, do not reject H0. There is insufficient evidence to conclude that there is an effect due to die temperature. H0: .3mm.  .4mm. H1: 3mm  4 mm Decision: Since FSTAT = 0.87 < 4.3512, do not reject H0. There is insufficient evidence to conclude that there is an effect die diameter.

(d)

(e)

At the 5% level of significance, neither die diameter nor die temperature has a significant effect on the foam diameter. There is no significant interaction between die diameter and die temperature.
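The F statistics in the 11.23 output can be reproduced from the sums of squares and degrees of freedom alone. The following is a minimal sketch, not the Excel procedure itself; the dictionary names and the helper `f_ratios` are illustrative assumptions, and the numbers are copied from the ANOVA output above.

```python
# Sums of squares and degrees of freedom copied from the 11.23 ANOVA output.
ss = {"Sample": 2.7812, "Columns": 1.8760, "Interaction": 0.7526, "Within": 43.0284}
df = {"Sample": 1, "Columns": 1, "Interaction": 1, "Within": 20}

def f_ratios(ss, df):
    """Return {source: (MS, F)} where MS = SS / df and F = MS / MS_within."""
    ms_within = ss["Within"] / df["Within"]
    out = {}
    for source in ss:
        if source == "Within":
            continue
        ms = ss[source] / df[source]
        out[source] = (ms, ms / ms_within)
    return out

F_CRIT = 4.3512  # critical value quoted in the output (alpha = 0.05, df = 1 and 20)

results = f_ratios(ss, df)
for source, (ms, f) in results.items():
    decision = "reject H0" if f > F_CRIT else "do not reject H0"
    print(f"{source:12s} MS = {ms:.4f}  F = {f:.4f}  -> {decision}")
```

Each F statistic is the mean square of that source divided by the within-cells mean square, so all three tests share the same denominator.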


11.24

(a) H0: There is no interaction between filling time and mold temperature. H1: There is an interaction between filling time and mold temperature. Because FSTAT = 0.1136/0.05 = 2.27 < 2.9277 or the p-value = 0.102 > 0.05, do not reject H0. There is insufficient evidence of interaction between filling time and mold temperature.

(b) FSTAT = 9.02 > 3.5546, reject H0. There is evidence of a difference in the warpage due to the filling time.

(c) FSTAT = 4.23 > 3.5546, reject H0. There is evidence of a difference in the warpage due to the mold temperature.

(d)


(e) The Tukey procedure revealed that the mean warpage for the 3 sec filling time group was significantly higher than for the 2 sec and 1 sec groups, which were not significantly different from each other. The Tukey procedure also revealed that the 60°C group was different from the 85°C group.

(f) Both filling time and temperature had main effects on warpage. Although the interaction plot appeared to show an interaction between filling time and temperature, it was not significant at the 0.05 significance level. The warpage for a three-second filling time seems to be much higher at 60°C and 72.5°C but not at 85°C. Caution should be used in interpreting this non-significant interaction due to relatively small sample sizes.

11.25

(a) At the 0.05 significance level, there is insufficient evidence of an interaction between breakoff pressure and stopper height. Because FSTAT = 0.23 or p-value = 0.640, do not reject H0.

(b) At the 0.05 significance level, there is no evidence of an effect of breakoff pressure. Because FSTAT = 1.56 or p-value = 0.235, do not reject H0.

(c) At the 0.05 significance level, there is no evidence of an effect of stopper height. Because FSTAT = 1.56 or p-value = 0.235, do not reject H0.

(d) The results from (a) through (d) indicate that neither stopper height nor breakoff pressure had an effect on the percentage of breakoff chips. There was no interaction between breakoff pressure and stopper height. The mean breakoff percentage for both the two and three breakoff pressure categories was higher for the twenty stopper height. The interaction plot revealed that the lines representing the means for the two and three breakoff pressures were parallel across the two heights. An interaction between the two variables would have been reflected by non-parallel lines. This did not occur. The interaction plot confirmed the FSTAT = 0.23 for the interaction between breakoff pressure and height.

11.26

(a) FSTAT = 0.83, p-value = 0.38 > 0.05, do not reject H0. There is not enough evidence to conclude that there is an interaction between zone 1 lower and zone 3 upper.

(b) FSTAT = 0.38, p-value = 0.548 > 0.05, do not reject H0. There is insufficient evidence to conclude that there is an effect due to zone 1 lower.

(c) FSTAT = 0.10, p-value = 0.752 > 0.05, do not reject H0. There is inadequate evidence to conclude that there is an effect due to zone 3 upper.

(d)

(e)

The cell means plot shows a large difference at a zone 3 upper of 695°C but only a small difference at a zone 3 upper of 715°C. Although this difference appeared on the cell means plot, the interaction was not statistically significant because of the large MSE. Caution should be used in interpreting these findings due to small sample sizes. Further testing should be done with larger sample sizes.

11.27

The among-groups variance MSA represents variation among the means of the different groups. The within-groups variance MSW measures variation within each group.

11.28

The completely randomized design evaluates one factor of interest, in which sample observations are randomly and independently drawn. The randomized block design also evaluates one factor of interest, but sample observations are divided into blocks according to common characteristics to reduce within group variation. The two-factor factorial design evaluates two factors of interest and the interaction between these two factors.

11.29

The major assumptions of ANOVA are randomness and independence, normality, and homogeneity of variance.

11.30

If the populations are approximately normally distributed and the variances of the groups are approximately equal, you select the one-way ANOVA F test to examine possible differences among the means of c independent populations.

11.31

When the ANOVA has indicated that at least one of the groups has a different population mean than the others, you should use multiple comparison procedures for evaluating pairwise combinations of the group means. In such cases, the Tukey-Kramer procedure should be used to compare all pairs of means.

11.32

The one-way ANOVA F test for a completely randomized design is used to test for the existence of treatment effect of the treatment variable on the mean level of the dependent variable, while the Levene test is used to test whether the amounts of variation of the dependent variable are the same across the different categories of the treatment variable.

11.33

You should use the two-way ANOVA F test to examine possible differences among the means of each factor in a factorial design when there are two factors of interest that are to be studied and more than one observation can be obtained for each treatment combination (to measure the interaction of the two factors).

11.34

Interaction measures the difference in the effect of one variable for the different levels of the second factor. If there is no interaction, any difference between levels of one factor will be the same at each level of the second factor.

11.35

You can obtain the interaction effect and carry out an F test for its significance. In addition, you can develop a plot of the response for each level of one factor at each level of a second factor.

11.36

(a) H0: There is no interaction. H1: There is an interaction. Decision: Since FSTAT = 0.01 < 2.9011, do not reject H0. There is not enough evidence to conclude that there is an interaction between supplier and loom.

(b) H0: $\mu_{jetta..} = \mu_{turk..}$ H1: At least one mean differs. Decision: Since FSTAT = 0.81 < 4.1491, do not reject H0. There is insufficient evidence to conclude that there is an effect due to loom.

(c) H0: $\mu_{.1.} = \mu_{.2.} = \mu_{.3.} = \mu_{.4.}$ H1: At least one mean differs. Decision: Since FSTAT = 5.20 > 2.9011, reject H0. There is adequate evidence to conclude that there is an effect due to suppliers.

(d) [Cell Means Plot: one line each for the jetta and turk looms across Supplier 1 through Supplier 4; vertical axis from 0 to 30.]

(e)

Output of the Tukey Procedure: For different suppliers, Q = 3.84 with numerator d.f. = 4 and denominator d.f. = 32.

Tukey-Kramer Multiple Comparisons

Group            Sample Mean   Sample Size
1: Supplier 1    18.97         10
2: Supplier 2    23.9          10
3: Supplier 3    22.41         10
4: Supplier 4    20.83         10

Other Data
Level of significance   0.05
Numerator d.f.          4
Denominator d.f.        32
MSE                     8.61225
Q Statistic             3.84

Comparison           Absolute Difference   Std. Error of Difference   Critical Range   Results
Group 1 to Group 2   4.93                  0.92802209                 3.5636           Means are different
Group 1 to Group 3   3.44                  0.92802209                 3.5636           Means are not different
Group 1 to Group 4   1.86                  0.92802209                 3.5636           Means are not different
Group 2 to Group 3   1.49                  0.92802209                 3.5636           Means are not different
Group 2 to Group 4   3.07                  0.92802209                 3.5636           Means are not different
Group 3 to Group 4   1.58                  0.92802209                 3.5636           Means are not different

There is a difference in mean strength between supplier 1 and supplier 2 only.


(f)

H0: $\mu_1 = \mu_2 = \mu_3 = \mu_4$ H1: At least one mean differs. Decision: Since FSTAT = 5.70 > 2.8663, reject H0. There is adequate evidence to conclude that there is an effect due to suppliers.

Output of Tukey-Kramer Procedure:

Tukey-Kramer Multiple Comparisons

Group            Sample Mean   Sample Size
1: Supplier 1    18.97         10
2: Supplier 2    23.9          10
3: Supplier 3    22.41         10
4: Supplier 4    20.83         10

Other Data
Level of significance   0.05
Numerator d.f.          4
Denominator d.f.        36
MSW                     7.856972
Q Statistic             3.79

Comparison           Absolute Difference   Std. Error of Difference   Critical Range   Results
Group 1 to Group 2   4.93                  0.88639564                 3.3594           Means are different
Group 1 to Group 3   3.44                  0.88639564                 3.3594           Means are different
Group 1 to Group 4   1.86                  0.88639564                 3.3594           Means are not different
Group 2 to Group 3   1.49                  0.88639564                 3.3594           Means are not different
Group 2 to Group 4   3.07                  0.88639564                 3.3594           Means are not different
Group 3 to Group 4   1.58                  0.88639564                 3.3594           Means are not different

The result is consistent with that in (b) and (e), except that the Tukey-Kramer procedure in the one-way ANOVA concludes that there is a difference in mean strength not only between supplier 1 and supplier 2 but also between supplier 1 and supplier 3.
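The critical ranges in both Tukey-Kramer outputs follow from Q, the error mean square, and the group sizes. The sketch below assumes the standard Tukey-Kramer critical range, Q times the square root of (MSE/2)(1/n_j + 1/n_j'); the function names are illustrative, and the Q and mean-square values are copied from the two outputs above.

```python
import math
from itertools import combinations

def critical_range(q, mse, n_a, n_b):
    """Tukey-Kramer critical range for comparing two group means."""
    return q * math.sqrt((mse / 2.0) * (1.0 / n_a + 1.0 / n_b))

# Sample means for Suppliers 1 to 4, each based on n = 10 observations.
means = {1: 18.97, 2: 23.90, 3: 22.41, 4: 20.83}
n = 10

def differing_pairs(q, mse):
    """Return the pairs whose absolute mean difference exceeds the critical range."""
    cr = critical_range(q, mse, n, n)
    return [(a, b) for a, b in combinations(means, 2)
            if abs(means[a] - means[b]) > cr]

two_way = differing_pairs(q=3.84, mse=8.61225)   # error term from the two-way ANOVA
one_way = differing_pairs(q=3.79, mse=7.856972)  # error term from the one-way ANOVA

print("two-way:", two_way)
print("one-way:", one_way)
```

The one-way analysis has a smaller error mean square and more error degrees of freedom, which shrinks the critical range from 3.5636 to 3.3594. That is why the 3.44 difference between suppliers 1 and 3 is flagged only in the one-way analysis.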


11.37

(a) H0: There is no interaction. H1: There is an interaction. Decision: Since FSTAT = 23.79 > 4.3512, reject H0. There is enough evidence to conclude that there is an interaction between machine type and reduction angle.

(b) Since there is an interaction between machine type and reduction angle, it is inappropriate to test whether there is an effect due to machine type.

(c) Since there is an interaction between machine type and reduction angle, it is inappropriate to test whether there is an effect due to reduction angle.

(d) [Cell Means Plot: one line each for W95 and W96 across the Narrow and Wide reduction angles; vertical axis from 91.5 to 94.5.]

(e)

There is enough evidence to conclude that there is an interaction between machine type and reduction angle. Since there is an interaction between machine type and reduction angle, it is inappropriate to test whether there is an effect due to machine type. Likewise, since there is an interaction between machine type and reduction angle, it is inappropriate to test whether there is an effect due to reduction angle.


(f)

H0: $\mu_{narrow} = \mu_{wide}$ H1: At least one mean differs. Decision: Since FSTAT = 16.70 > 4.3009, reject H0. There is adequate evidence to conclude that there is an effect due to reduction angle. You conclude that there is adequate evidence of an effect due to reduction angle here, while in (c) and (e) it is inappropriate to test whether there is an effect due to reduction angle since there is an interaction between machine type and reduction angle.

11.38

(a)

To test the homogeneity of variance, you perform a Levene's test. H0: $\sigma_1^2 = \sigma_2^2 = \sigma_3^2$ H1: Not all $\sigma_j^2$ are the same.

Excel output:

ANOVA
Source of Variation   SS     df   MS         F         P-value    F crit
Between Groups        0.07   2    0.035      0.07468   0.928383   3.682317
Within Groups         7.03   15   0.468667
Total                 7.1    17

Since the p-value = 0.928 > 0.05, do not reject H0. There is not enough evidence of a significant difference in the variances of the breaking strengths for the three air-jet pressures.


(b)

H0: $\mu_1 = \mu_2 = \mu_3$ H1: At least one of the means differs. Decision rule: If FSTAT > 3.68, reject H0.

Decision: Since FSTAT = 4.09 is greater than the critical bound of 3.68, reject H0. There is enough evidence to conclude that the mean breaking strengths differ for the three air-jet pressures.

(c)

(d)

Breaking strength scores under 30 psi are significantly higher than those under 50 psi. No other pairwise comparisons were significant at the 0.05 significance level. Other things being equal, use 30 psi.



11.39

(a) H0: There is no interaction between side-to-side aspect and air-jet pressure. H1: There is an interaction between side-to-side aspect and air-jet pressure. Decision rule: If FSTAT > 3.89, reject H0. Test statistic: FSTAT = 1.9719. Decision: Since FSTAT = 1.97 is less than the critical bound of 3.89, do not reject H0. There is insufficient evidence to conclude there is an interaction between side-to-side aspect and air-jet pressure.

(b) H0: $\mu_1 = \mu_2$ H1: $\mu_1 \ne \mu_2$ Decision rule: If FSTAT > 4.75, reject H0. Test statistic: FSTAT = 4.87. Decision: Since FSTAT = 4.87 is greater than the critical bound of 4.75, reject H0. There is sufficient evidence to conclude that mean breaking strength does differ between the two levels of side-to-side aspect.

(c) H0: $\mu_1 = \mu_2 = \mu_3$ H1: At least one of the means differs. Decision rule: If FSTAT > 3.89, reject H0. Test statistic: FSTAT = 5.67. Decision: Since FSTAT = 5.67 is greater than the critical bound of 3.89, reject H0. There is enough evidence to conclude that the mean breaking strengths differ for the three air-jet pressures.


(d)

(e)

(f) (g)

Mean breaking strengths under 30 psi are higher than those under 40 psi or 50 psi. The mean breaking strength is highest under 30 psi. The two-factor experiment gave a more complete, refined set of results than the one-factor experiment. Not only was the side-to-side aspect factor significant, but the application of the Tukey procedure to the air-jet pressure factor also determined that breaking strength scores are highest under 30 psi.

11.40

ANOVA
Source of Variation   SS         df   MS         F         P-value    F crit
Sample                112.5603   1    112.5603   30.4434   3.07E-06   4.113165
Columns               46.01025   1    46.01025   12.4441   0.001165   4.113165
Interaction           0.70225    1    0.70225    0.1899    0.665575   4.113165
Within                133.105    36   3.697361
Total                 292.3778   39

(a) H0: There is no interaction between type of breakfast and desired time. H1: There is an interaction between type of breakfast and desired time. Decision rule: If FSTAT > 4.1132, reject H0. Test statistic: FSTAT = 0.1899. Decision: Since FSTAT = 0.1899 is less than the critical bound of 4.1132, do not reject H0. There is insufficient evidence to conclude that there is any interaction between type of breakfast and desired time.

(b) H0: $\mu_1 = \mu_2$ H1: $\mu_1 \ne \mu_2$ Population 1 = Continental, 2 = American. Decision rule: If FSTAT > 4.1132, reject H0. Test statistic: FSTAT = 30.4434. Decision: Since FSTAT = 30.4434 is greater than the critical bound of 4.1132, reject H0. There is sufficient evidence to conclude that there is an effect that is due to type of breakfast.

(c) H0: $\mu_1 = \mu_2$ H1: $\mu_1 \ne \mu_2$ Population 1 = Early, 2 = Late. Decision rule: If FSTAT > 4.1132, reject H0. Test statistic: FSTAT = 12.4441. Decision: Since FSTAT = 12.4441 is greater than the critical bound of 4.1132, reject H0. There is sufficient evidence to conclude that there is an effect that is due to desired time.

(d) [Plot: Mean Delivery Time Difference. Vertical axis: Time (minutes), 0 to 7; horizontal axis: Desired Time Period (Early Time Period, Late Time Period); one line each for Continental and American.]

(e) At the 5% level of significance, both the type of breakfast ordered and the desired time have an effect on delivery time difference. There is no interaction between the type of breakfast ordered and the desired time.

11.41

Two-way ANOVA output from Excel:

ANOVA
Source of Variation   SS         df   MS         F          P-value    F crit
Sample                55.46025   1    55.46025   14.99995   0.000436   4.113165
Columns               13.11025   1    13.11025   3.54584    0.067795   4.113165
Interaction           5.40225    1    5.40225    1.46111    0.234633   4.113165
Within                133.105    36   3.697361
Total                 207.0778   39

(a) H0: There is no interaction between type of breakfast and desired time. H1: There is an interaction between type of breakfast and desired time. Decision rule: If FSTAT > 4.1132, reject H0. Test statistic: FSTAT = 1.4611. Decision: Since FSTAT = 1.4611 is less than the critical bound of 4.1132, do not reject H0. There is insufficient evidence to conclude that there is any interaction between type of breakfast and desired time.

(b) H0: $\mu_1 = \mu_2$ H1: $\mu_1 \ne \mu_2$ Population 1 = Continental, 2 = American. Decision rule: If FSTAT > 4.1132, reject H0. Test statistic: FSTAT = 15. Decision: Since FSTAT = 15 is greater than the critical bound of 4.1132, reject H0. There is sufficient evidence to conclude that there is an effect that is due to type of breakfast.

(c) H0: $\mu_1 = \mu_2$ H1: $\mu_1 \ne \mu_2$ Population 1 = Early, 2 = Late. Decision rule: If FSTAT > 4.1132, reject H0. Test statistic: FSTAT = 3.5458. Decision: Since FSTAT = 3.5458 is less than the critical bound of 4.1132, do not reject H0. There is insufficient evidence to conclude that there is an effect that is due to desired time.

(d)

(e) At the 5% level of significance, only the type of breakfast ordered has an effect on delivery time difference. There is no interaction between the type of breakfast ordered and the desired time.

11.42

H0: There is no interaction between the size of the pieces and the can fill height. H1: There is an interaction between the size of the pieces and the can fill height. Decision rule: If p-value < 0.05, reject H0.

Test statistic: FSTAT = 0.2169. Decision: Since p-value is 0.6428, do not reject H0. There is not sufficient evidence to conclude there is an interaction between the size of the pieces and the can fill height.

Both piece size and fill height had significant main effects. The FSTAT was much higher for piece size, which suggests that this factor may be of more importance relative to fill height. Filling cans with fine cut size led to more accuracy in relation to the weight of the can compared to the label weight. Filling cans with fine cut size and lower fill height led to the most accuracy in relation to the weight of the can compared to the label weight. Based on these results, one would recommend that the pet food company use the fine cut size and the lower fill height.


A one-way ANOVA on fill height produced FSTAT = 18.44. Because FSTAT = 18.44 or p-value = 0.000, reject H0.

A one-way ANOVA on cut size produced FSTAT = 223.98. Because FSTAT = 223.98 or p-value = 0.000, reject H0.


11.43

H0: There is no interaction between the linear widths and height placement. H1: There is an interaction between the linear widths and height placement.

PHStat Two-Way ANOVA with replication

SUMMARY     Below Eye Level   At Eye Level   Above Eye Level   Total

5 feet
Count       2                 2              2                 6
Sum         44                47             41                132
Average     22                23.5           20.5              22
Variance    2.0000            12.5000        4.5000            5.6000

6 feet
Count       2                 2              2                 6
Sum         54                47             52                153
Average     27                23.5           26                25.5
Variance    2.0000            0.5000         2.0000            3.5000

7 feet
Count       2                 2              2                 6
Sum         64                63             67                194
Average     32                31.5           33.5              32.33333333
Variance    8.0000            4.5000         4.5000            4.2667

Total
Count       6                 6              6
Sum         162               157            160
Average     27                26.16666667    26.66666667
Variance    22.4000           20.5667        36.2667

PHStat Two-way ANOVA output:

ANOVA
Source of Variation   SS         df   MS         F         P-value   F crit
Sample                331.4444   2    165.7222   36.8272   0.0000    4.2565
Columns               2.1111     2    1.0556     0.2346    0.7956    4.2565
Interaction           24.2222    4    6.0556     1.3457    0.3255    3.6331
Within                40.5000    9    4.5000
Total                 398.2778   17

Level of significance: 0.05

Linear width (Sample): Decision rule: If FSTAT > 4.2565, reject H0. Test statistic: FSTAT = 36.8272. Decision: Since FSTAT = 36.8272 is greater than the critical bound of 4.2565, reject H0.

Height placement (Columns): Decision rule: If FSTAT > 4.2565, reject H0. Test statistic: FSTAT = 0.2346. Decision: Since FSTAT = 0.2346 is less than the critical bound of 4.2565, do not reject H0.

Interaction: Decision rule: If FSTAT > 3.6331, reject H0. Test statistic: FSTAT = 1.3457. Decision: Since FSTAT = 1.3457 is less than the critical bound of 3.6331, do not reject H0.
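Because the design in 11.43 is balanced (two observations per cell), the whole ANOVA table can be rebuilt from the cell averages and cell variances in the SUMMARY output. This is a sketch under that balanced-design assumption, not PHStat's own routine; `two_way_anova` and its variable names are hypothetical.

```python
# Cell averages and variances from the 11.43 SUMMARY output.
# Rows: 5, 6, 7 feet. Columns: below, at, above eye level.
cell_means = [
    [22.0, 23.5, 20.5],
    [27.0, 23.5, 26.0],
    [32.0, 31.5, 33.5],
]
cell_vars = [
    [2.0, 12.5, 4.5],
    [2.0, 0.5, 2.0],
    [8.0, 4.5, 4.5],
]
N_REP = 2  # observations per cell

def two_way_anova(means, variances, n_rep):
    """Balanced two-factor ANOVA with replication; returns {source: (SS, df, F)}."""
    r, c = len(means), len(means[0])
    grand = sum(sum(row) for row in means) / (r * c)
    row_means = [sum(row) / c for row in means]
    col_means = [sum(means[i][j] for i in range(r)) / r for j in range(c)]

    ss_rows = c * n_rep * sum((m - grand) ** 2 for m in row_means)
    ss_cols = r * n_rep * sum((m - grand) ** 2 for m in col_means)
    ss_inter = n_rep * sum(
        (means[i][j] - row_means[i] - col_means[j] + grand) ** 2
        for i in range(r) for j in range(c)
    )
    # Each cell variance times (n_rep - 1) is that cell's within sum of squares.
    ss_within = (n_rep - 1) * sum(sum(row) for row in variances)

    df_within = r * c * (n_rep - 1)
    ms_within = ss_within / df_within
    table = {"Within": (ss_within, df_within, None)}
    for name, ss, df in (("Sample", ss_rows, r - 1),
                         ("Columns", ss_cols, c - 1),
                         ("Interaction", ss_inter, (r - 1) * (c - 1))):
        table[name] = (ss, df, (ss / df) / ms_within)
    return table

anova = two_way_anova(cell_means, cell_vars, N_REP)
for source in ("Sample", "Columns", "Interaction", "Within"):
    print(source, anova[source])
```

The interaction sum of squares is built from the part of each cell mean that the row and column means do not explain, which is why parallel lines on a cell means plot correspond to an interaction F statistic near zero.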




Chapter 12

12.1

(a) For df = 1 and $\alpha = 0.01$, $\chi^2 = 6.635$.

(b) For df = 1 and $\alpha = 0.005$, $\chi^2 = 7.879$.

(c) For df = 1 and $\alpha = 0.10$, $\chi^2 = 2.706$.

12.2

(a) For df = 1 and $\alpha = 0.05$, $\chi^2 = 3.841$.

(b) For df = 1 and $\alpha = 0.025$, $\chi^2 = 5.024$.

(c) For df = 1 and $\alpha = 0.01$, $\chi^2 = 6.635$.
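With one degree of freedom, a chi-square variable is the square of a standard normal variable, so these table values can be checked without a chi-square table. The sketch below relies on that identity and on Python's standard-library `statistics.NormalDist`; the helper name `chi2_crit_df1` is an illustrative assumption, and the approach applies to df = 1 only.

```python
from statistics import NormalDist

def chi2_crit_df1(alpha):
    """Upper-tail chi-square critical value for df = 1.

    For df = 1, P(chi-square > c) = P(|Z| > sqrt(c)), so c is the square
    of the two-tail standard normal critical value.
    """
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return z * z

for alpha in (0.10, 0.05, 0.025, 0.01, 0.005):
    print(f"alpha = {alpha:<5}  chi-square critical value = {chi2_crit_df1(alpha):.3f}")
```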

12.3

(a)–(b)

$\bar{p} = \frac{X_1 + X_2}{n_1 + n_2} = \frac{20 + 40}{50 + 85} = \frac{60}{135} = \frac{4}{9} = 0.4444$

1A: $\bar{p} = 0.4444$ and $n_1 = 50$, so $f_e = 22.22$
1B: $\bar{p} = 0.4444$ and $n_2 = 85$, so $f_e = 37.78$
2A: $1 - \bar{p} = 0.5556$ and $n_1 = 50$, so $f_e = 27.78$
2B: $1 - \bar{p} = 0.5556$ and $n_2 = 85$, so $f_e = 47.22$

         Column A                     Column B
         Observed Freq  Expected Freq   Observed Freq  Expected Freq   Total Obs
Row 1    20             22.22           40             37.78           60
Row 2    30             27.78           45             47.22           75
Total    50                             85                             135 (grand total)

(c)

$\chi^2_{STAT} = \sum_{\text{All Cells}} \frac{(f_o - f_e)^2}{f_e} = \frac{(20 - 22.22)^2}{22.22} + \frac{(40 - 37.78)^2}{37.78} + \frac{(30 - 27.78)^2}{27.78} + \frac{(45 - 47.22)^2}{47.22} = 0.634$

Since $\chi^2_{STAT} = 0.634 < 3.841$, it is not significant at the 5% level of significance.

12.4

(a)

$\bar{p} = \frac{X_1 + X_2}{n_1 + n_2} = \frac{20 + 30}{50 + 50} = \frac{1}{2} = 0.5$

         Column 1                                   Column 2
         Observed  Expected  chi-sq contrib         Observed  Expected  chi-sq contrib    Total Obs
Row 1    20        25        1.00                   30        25        1.00              50
Row 2    30        25        1.00                   20        25        1.00              50
Total    50                                         50                                    100 (grand total)

(b)

Decision rule: If $\chi^2_{STAT} > 3.841$, reject H0.

Test statistic: $\chi^2_{STAT} = \sum_{\text{All Cells}} \frac{(f_o - f_e)^2}{f_e} = 1.00 + 1.00 + 1.00 + 1.00 = 4$

Decision: Since $\chi^2_{STAT} = 4$ is greater than the critical value of 3.841, it is significant at the 5% level of significance.
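The expected frequencies and test statistics in 12.3 and 12.4 can be computed directly from the observed counts. The following is a minimal sketch of the usual row-total times column-total over grand-total calculation; `chi_square` is a hypothetical helper, not a PHStat function. Note that it keeps unrounded expected frequencies, so for 12.3 it returns about 0.635 rather than the 0.634 obtained from expected frequencies rounded to two decimals.

```python
def chi_square(observed):
    """Return (chi-square statistic, expected frequencies) for an r x c table."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand = sum(row_totals)

    stat = 0.0
    expected = []
    for i, row in enumerate(observed):
        exp_row = []
        for j, f_o in enumerate(row):
            f_e = row_totals[i] * col_totals[j] / grand  # expected frequency
            exp_row.append(f_e)
            stat += (f_o - f_e) ** 2 / f_e               # cell contribution
        expected.append(exp_row)
    return stat, expected

# Problem 12.3: rows 1 and 2, columns A and B.
stat_12_3, exp_12_3 = chi_square([[20, 40], [30, 45]])
# Problem 12.4: every cell contributes (5 ** 2) / 25 = 1.00.
stat_12_4, exp_12_4 = chi_square([[20, 30], [30, 20]])

CRITICAL = 3.841  # df = 1, alpha = 0.05
for label, stat in (("12.3", stat_12_3), ("12.4", stat_12_4)):
    verdict = "significant" if stat > CRITICAL else "not significant"
    print(f"Problem {label}: chi-square = {stat:.4f} ({verdict} at the 5% level)")
```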

12.5

(a) Chi-Square Test

Observed Frequencies
                 Subscribers
Churn     Basic      Premium    Total
Yes       883        873        1756
No        1858       1903       3761
Total     2741       2776       5517

Expected Frequencies
                 Subscribers
Churn     Basic      Premium    Total
Yes       872.4299   883.5701   1756
No        1868.57    1892.43    3761
Total     2741       2776       5517

Data
Level of Significance   0.05
Number of Rows          2
Number of Columns       2
Degrees of Freedom      1

Results
Critical Value              3.841459
Chi-Square Test Statistic   0.373342
p-Value                     0.541188
Do not reject the null hypothesis

H0: $\pi_1 = \pi_2$ H1: $\pi_1 \ne \pi_2$
The $\chi^2_{STAT}$ = 0.373342 < 3.841459 for $\alpha$ = 0.05. Do not reject H0.


(b) Chi-Square Test

Observed Frequencies
                 Subscriber
Churn     Basic    Premium   Total
Yes       16       15        31
No        34       35        69
Total     50       50        100

Expected Frequencies
                 Subscriber
Churn     Basic    Premium   Total
Yes       15.5     15.5      31
No        34.5     34.5      69
Total     50       50        100

Data
Level of Significance   0.05
Number of Rows          2
Number of Columns       2
Degrees of Freedom      1

Results
Critical Value              3.841459
Chi-Square Test Statistic   0.046751
p-Value                     0.828817
Do not reject the null hypothesis

H0: $\pi_1 = \pi_2$ H1: $\pi_1 \ne \pi_2$
The $\chi^2_{STAT}$ = 0.046751 < 3.841459 for $\alpha$ = 0.05. Do not reject H0.

(c) The results in parts (a) and (b) are the same: do not reject H0. The p-values are different. A p-value of 0.541188 indicates that the probability of obtaining a $\chi^2_{STAT}$ of 0.373342 or larger is 54.1188% when the null hypothesis is true. A p-value of 0.828817 indicates that the probability of obtaining a $\chi^2_{STAT}$ of 0.046751 or larger is 82.8817% when the null hypothesis is true.

(d) The results of (a) and (b) are exactly the same as those of Problem 10.29. The $\chi^2_{STAT}$ in (a) and the Z in Problem 10.29 (a) satisfy the relationship that $\chi^2_{STAT} = 0.3733 = Z^2 = (0.6110)^2$, and the p-value in (b) is exactly the same as the p-value in Problem 10.29 (b), 0.8288.
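The relationship between the chi-square statistic and the Z statistic can be checked numerically with the counts from 12.5 (a). The sketch assumes the Z test in Problem 10.29 is the usual pooled-proportion Z test for two proportions (that problem is not reproduced here); both function names are illustrative.

```python
import math

def z_two_proportions(x1, n1, x2, n2):
    """Pooled-proportion Z statistic for H0: pi_1 = pi_2."""
    p1, p2 = x1 / n1, x2 / n2
    p_bar = (x1 + x2) / (n1 + n2)
    se = math.sqrt(p_bar * (1 - p_bar) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se

def chi_square_2x2(x1, n1, x2, n2):
    """Chi-square statistic for the 2 x 2 table built from the same counts."""
    observed = [[x1, x2], [n1 - x1, n2 - x2]]
    row_totals = [sum(row) for row in observed]
    col_totals = [n1, n2]
    grand = n1 + n2
    return sum(
        (observed[i][j] - row_totals[i] * col_totals[j] / grand) ** 2
        / (row_totals[i] * col_totals[j] / grand)
        for i in range(2) for j in range(2)
    )

# Part (a): 883 of 2741 Basic and 873 of 2776 Premium subscribers churned.
z = z_two_proportions(883, 2741, 873, 2776)
chi2 = chi_square_2x2(883, 2741, 873, 2776)
print(f"Z = {z:.4f}, Z squared = {z * z:.6f}, chi-square = {chi2:.6f}")
```

The two statistics agree up to floating-point error, which is why the two-tail Z test and the chi-square test with df = 1 always reach the same decision.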

12.6

(a)

H0: $\pi_1 = \pi_2$ H1: $\pi_1 \ne \pi_2$

Chi-Square Test

Observed Frequencies
                       Purchases
Consumed Coffee    Yes    No     Total
Caffeinated        40     60     100
Decaffeinated      10     90     100
Total              50     150    200

Expected Frequencies
                       Purchases
Consumed Coffee    Yes    No     Total
Caffeinated        25     75     100
Decaffeinated      25     75     100
Total              50     150    200

Data
Level of Significance   0.05
Number of Rows          2
Number of Columns       2
Degrees of Freedom      1

Results
Critical Value              3.841459
Chi-Square Test Statistic   24
p-Value                     9.63E-07
Reject the null hypothesis

(b)

Decision rule: df = 1. If $\chi^2_{STAT} > 3.841$ or p-value < 0.05, reject H0. Test statistic: $\chi^2_{STAT} = 24$. Decision: Since $\chi^2_{STAT} = 24$ is larger than the upper critical bound of 3.841, reject H0. There is evidence to conclude that the population proportion of those who drink caffeinated coffee is different from those who do not drink caffeinated coffee.

(c)

The p-value is 0.0000 to four decimal places (9.63E-07 in the output). The probability of obtaining a test statistic of 24.0 or larger when the null hypothesis is true is 0.0000.

(d)

You should not compare the results in (b) to those of Problem 10.30 because Problem 10.30 was a one-tail test.
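The p-values reported for these 2 x 2 tables can be recovered from the test statistic using only the standard library. The sketch below uses the fact that, for df = 1, the upper-tail chi-square probability equals `erfc(sqrt(x / 2))`; it does not extend to other degrees of freedom, and `chi2_p_value_df1` is a hypothetical helper name.

```python
import math

def chi2_p_value_df1(stat):
    """Upper-tail p-value of a chi-square statistic with 1 degree of freedom.

    For df = 1, P(chi-square > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2)).
    """
    return math.erfc(math.sqrt(stat / 2.0))

# Test statistics from Problems 12.5 (a), 12.5 (b), and 12.6.
for label, stat in (("12.5 (a)", 0.373342), ("12.5 (b)", 0.046751), ("12.6", 24.0)):
    print(f"Problem {label}: chi-square = {stat}, p-value = {chi2_p_value_df1(stat):.6g}")
```

For Problem 12.6 the p-value is small but not exactly zero; 0.0000 is its value rounded to four decimal places.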

12.7

From PHStat

Chi-Square Test

Observed Frequencies
                 Workers
Training   U.S.     Canada   Total
Yes        463.5    350      813.5
No         566.5    680      1246.5
Total      1030     1030     2060

Expected Frequencies
                 Workers
Training   U.S.     Canada   Total
Yes        406.75   406.75   813.5
No         623.25   623.25   1246.5
Total      1030     1030     2060

Data
Level of Significance   0.05
Number of Rows          2
Number of Columns       2
Degrees of Freedom      1

Results
Critical Value              3.841459
Chi-Square Test Statistic   26.17032
p-Value                     3.13E-07
Reject the null hypothesis

(a)

At the 0.05 significance level, there is evidence of a difference in the proportion of U.S. and Canadian workers whose organization provides explicit training for all people managers. Because $\chi^2_{STAT}$ = 26.17032 > 3.841 or p-value = 0.0000, reject H0.

(b)

The p-value of 0.0000 is below the 0.05 significance level, which would allow one to reject the H0 that the percentage of U.S. and Canadian workers whose organization provides explicit training for all people managers is equal. A p-value of 0.0000 implies that the probability of obtaining a $\chi^2_{STAT}$ of 26.17032 or larger is essentially 0 when H0 is true.

(c)

The results of (a) and (b) are exactly the same as those of Problem 10.31. The $\chi^2_{STAT}$ in (a) and the Z in Problem 10.31 (a) satisfy the relationship that $\chi^2_{STAT} = 26.17032 = Z^2 = (5.1157)^2$, and the p-value is exactly the same as the p-value in Problem 10.31, 0.0000.

12.8

(a)

H0: $\pi_1 = \pi_2$. H1: $\pi_1 \ne \pi_2$.

Chi-Square Test

Observed Frequencies (U.S. Organizations Providing Benefits)
                 HR      Workers   Total
Effective        1112    302       1414
Not Effective    625     340       965
Total            1737    642       2379

Expected Frequencies (U.S. Organizations Providing Benefits)
                 HR         Workers    Total
Effective        1032.416   381.5839   1414
Not Effective    704.5839   260.4161   965
Total            1737       642        2379

Data
Level of Significance   0.01
Number of Rows          2
Number of Columns       2
Degrees of Freedom      1

Results
Critical Value              6.634897
Chi-Square Test Statistic   56.04305
p-Value                     7.09E-14
Reject the null hypothesis

Because $\chi^2_{STAT}$ = 56.0431 > 6.635, reject H0. There is evidence of a difference between HR and workers with respect to the proportion that rated their organizations as effective in providing affordable and comprehensive healthcare benefits.

(b)

The p-value = 0.0000. The probability of obtaining a difference in proportions that gives rise to a test statistic above 56.0431 is 0.0000 if there is no difference in the proportion in the two groups.

(c) and (d) The results of (a) and (b) are exactly the same as those of Problem 10.32. The $\chi^2$ in (a) and the Z in Problem 10.32 (a) satisfy the relationship that $\chi^2 = 56.0431 = Z^2 = (7.4862)^2$, and the p-value in (b) is exactly the same as the p-value computed in Problem 10.32 (b).

12.9

From PHStat, LinkedIn

Chi-Square Test

Observed Frequencies
                    Marketers
Use LinkedIn   B2B      B2C      Total
Yes            1246     1514     2760
No             292      1343     1635
Total          1538     2857     4395

Expected Frequencies
                    Marketers
Use LinkedIn   B2B       B2C        Total
Yes            965.843   1794.157   2760
No             572.157   1062.843   1635
Total          1538      2857       4395

Data
Level of Significance   0.05
Number of Rows          2
Number of Columns       2
Degrees of Freedom      1

Results
Critical Value              3.841459
Chi-Square Test Statistic   336.0363
p-Value                     4.66E-75
Reject the null hypothesis

(a)

At the 0.05 significance level, there is evidence of a significant difference in the percentage of B2B and B2C marketers that use LinkedIn. Because $\chi^2_{STAT}$ = 336.0363 > 3.841 or p-value = 0.0000, reject H0. The results indicate that a significantly higher proportion (81%) of B2B marketers utilized LinkedIn than the proportion (53%) of B2C marketers who utilized LinkedIn.

(b)

The p-value of 0.000 is well below the 0.05 significance level, which would allow one to reject the H0 that the percentage of B2B and B2C marketers who utilize LinkedIn is equal.

From PHStat, YouTube

Chi-Square Test

Observed Frequencies
                   Marketers
Use YouTube   B2B      B2C      Total
Yes           877      1571     2448
No            662      1099     1761
Total         1539     2670     4209

Expected Frequencies
                   Marketers
Use YouTube   B2B        B2C        Total
Yes           895.0991   1552.901   2448
No            643.9009   1117.099   1761
Total         1539       2670       4209

Data
Level of Significance   0.05
Number of Rows          2
Number of Columns       2
Degrees of Freedom      1

Results
Critical Value              3.841459
Chi-Square Test Statistic   1.378887
p-Value                     0.240291
Do not reject the null hypothesis

(c)

At the 0.05 significance level, there is insufficient evidence of a significant difference in the percentage of B2B and B2C marketers that use YouTube. Because $\chi^2_{STAT}$ = 1.378887 < 3.841 (p-value = 0.240291), do not reject H0.

(d)

The p-value of 0.240291 is well above the 0.05 significance level, so one does not reject the H0 that the percentages of B2B and B2C marketers who use YouTube are equal.

(e)

The results of (a) and (c) are exactly the same as those of Problem 10.33. The $\chi^2_{STAT}$ in (a) and the $Z_{STAT}$ in Problem 10.33 (a) satisfy the relationship $\chi^2_{STAT} = 336.0363 = Z_{STAT}^2 = (18.3313)^2$, and the p-value is exactly the same as the p-value in Problem 10.31, 0.0000.
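The relationship in (e) can be checked numerically. The following is a minimal sketch, not PHStat's implementation: the function names are illustrative, and it uses the YouTube counts from part (c) because those are the counts shown in full above. The p-value shortcut through `math.erfc` is valid only for 1 degree of freedom.

```python
import math

def chi_square_2x2(x1, n1, x2, n2):
    """Chi-square statistic for a 2 x 2 table built from two independent samples."""
    p_bar = (x1 + x2) / (n1 + n2)              # pooled proportion of "Yes"
    chi2 = 0.0
    for x, n in ((x1, n1), (x2, n2)):
        # "Yes" cell and "No" cell of this column, each with its expected frequency
        for observed, expected in ((x, n * p_bar), (n - x, n * (1 - p_bar))):
            chi2 += (observed - expected) ** 2 / expected
    return chi2

def z_two_proportions(x1, n1, x2, n2):
    """Pooled Z statistic for the difference between two proportions."""
    p_bar = (x1 + x2) / (n1 + n2)
    std_error = math.sqrt(p_bar * (1 - p_bar) * (1 / n1 + 1 / n2))
    return (x1 / n1 - x2 / n2) / std_error

# YouTube counts from part (c): 877 of 1539 B2B and 1571 of 2670 B2C marketers
chi2 = chi_square_2x2(877, 1539, 1571, 2670)
z = z_two_proportions(877, 1539, 1571, 2670)
p_value = math.erfc(math.sqrt(chi2 / 2))       # upper-tail chi-square area, df = 1 only

print(round(chi2, 6), round(z * z, 6), round(p_value, 6))
```

The two statistics agree because, with 1 degree of freedom, the chi-square distribution is the distribution of a squared standard normal variable.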

12.10

(a) From PHStat Chi-Square Test

Observed Frequencies
                         Region
Prefer Open-air    Northeast    Midwest     Total
Yes                       63         44       107
No                       133        164       297
Total                    196        208       404

Expected Frequencies
                         Region
Prefer Open-air    Northeast    Midwest     Total
Yes                 51.91089   55.08911       107
No                  144.0891   152.9109       297
Total                    196        208       404

Data
Level of Significance        0.05
Number of Rows               2
Number of Columns            2
Degrees of Freedom           1

Results
Critical Value               3.841459
Chi-Square Test Statistic    6.258608
p-Value                      0.012359
Reject the null hypothesis

(b)

2 Because STAT = 6.258608 > 3.841, reject H0. There is evidence that there is a significant difference between the proportion of shoppers in the Northeast and Midwest who prefer open-air markets.

(c)

The p-value = 0.012359. The probability of obtaining a test statistic of 6.258608 or larger when the null hypothesis is true is 0.012359.

(d)

The confidence interval is from 0.0119 to 0.1865. The results are identical because $(2.5017)^2 = 6.2586$.

12.11

(a)

df = (r – 1)(c – 1) = (2 – 1)(4 – 1) = 3

(b)

$\chi^2 = 7.815$

(c)

$\chi^2 = 11.345$

12.12

(a)

$\bar{p} = \frac{X_1 + X_2 + X_3}{n_1 + n_2 + n_3} = \frac{10 + 30 + 50}{50 + 75 + 100} = \frac{90}{225} = 0.4$

Row 1 A: $\bar{p} = 0.4$ and $n_1 = 50$, so $f_e = 20$
Row 1 B: $\bar{p} = 0.4$ and $n_2 = 75$, so $f_e = 30$
Row 1 C: $\bar{p} = 0.4$ and $n_3 = 100$, so $f_e = 40$
Thus, the expected frequencies in the first row are 20, 30, and 40.

Row 2 A: $1 - \bar{p} = 0.6$ and $n_1 = 50$, so $f_e = 30$
Row 2 B: $1 - \bar{p} = 0.6$ and $n_2 = 75$, so $f_e = 45$
Row 2 C: $1 - \bar{p} = 0.6$ and $n_3 = 100$, so $f_e = 60$
Thus, the expected frequencies in the second row are 30, 45, and 60.

(b)

$\chi^2_{STAT}$ = 12.500. The critical value with 2 degrees of freedom and $\alpha$ = 0.05 is 5.991. The result is deemed significant.

12.13

(a)

$\bar{p} = \frac{X_1 + X_2 + X_3}{n_1 + n_2 + n_3} = \frac{20 + 25 + 25}{50 + 50 + 50} = \frac{70}{150} = 0.4667$

The calculations for A, B, and C of Row 1 are identical.
Row 1 A: $\bar{p} = 0.4667$ and $n_1 = 50$, so $f_e = 23.3333$
Thus, the expected frequencies in the first row are 23.3333, 23.3333, and 23.3333.

The calculations for A, B, and C of Row 2 are identical.
Row 2 A: $1 - \bar{p} = 0.5333$ and $n_1 = 50$, so $f_e = 26.6667$
Thus, the expected frequencies in the second row are 26.6667, 26.6667, and 26.6667.

(b)

$\chi^2_{STAT}$ = 1.339286. The critical value with 2 degrees of freedom and $\alpha$ = 0.05 is 5.991. The result is not deemed significant.
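The expected-frequency arithmetic of Problems 12.12 and 12.13 follows one pattern: pool the proportion, then multiply by each sample size. A minimal sketch using the 12.13 counts is shown here; the function names are illustrative, and the closed-form p-value `exp(-chi2 / 2)` holds only for 2 degrees of freedom.

```python
import math

def expected_frequencies(successes, sizes):
    """Expected counts for a 2 x c table under H0 of equal proportions."""
    p_bar = sum(successes) / sum(sizes)            # pooled proportion
    row1 = [n * p_bar for n in sizes]              # expected "items of interest"
    row2 = [n * (1 - p_bar) for n in sizes]        # expected "not of interest"
    return p_bar, row1, row2

def chi_square_2xc(successes, sizes):
    """Sum of (f_o - f_e)^2 / f_e over all 2 x c cells."""
    _, row1, row2 = expected_frequencies(successes, sizes)
    chi2 = 0.0
    for x, n, e1, e2 in zip(successes, sizes, row1, row2):
        chi2 += (x - e1) ** 2 / e1 + ((n - x) - e2) ** 2 / e2
    return chi2

# Problem 12.13: X = 20, 25, 25 out of n = 50, 50, 50
successes, sizes = [20, 25, 25], [50, 50, 50]
p_bar, row1, row2 = expected_frequencies(successes, sizes)
chi2 = chi_square_2xc(successes, sizes)
p_value = math.exp(-chi2 / 2)                      # upper-tail area, df = 2 only

print(round(p_bar, 4), [round(e, 4) for e in row1], round(chi2, 6))
```

Comparing `chi2` with the critical value 5.991 gives the same decision as part (b) of Problem 12.13.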

12.14

(a) Chi-Square Test

Observed Frequencies
                           Column variable
Owns Smartphone    18-29    30-49    50-64      65+    Total
Yes                  192      190      166      122      670
No                     8       10       34       78      130
Total                200      200      200      200      800

Expected Frequencies
                           Column variable
Owns Smartphone    18-29    30-49    50-64      65+    Total
Yes                167.5    167.5    167.5    167.5      670
No                  32.5     32.5     32.5     32.5      130
Total                200      200      200      200      800

Data
Level of Significance        0.05
Number of Rows               2
Number of Columns            4
Degrees of Freedom           3

Results
Critical Value               7.814728
Chi-Square Test Statistic    116.7945
p-Value                      3.78E-25
Reject the null hypothesis

Because the calculated test statistic 116.7945 is greater than the critical value of 7.8147, you reject H0 and conclude that there is evidence of a difference among the age groups in the proportion of smartphone owners.

(b)

p-value = 0.0000. The probability of obtaining a data set that gives rise to a test statistic of 116.7945 or more is 0.0000 if there is no difference in the proportion of smartphone owners.

12.14 cont.

(c) Marascuilo Procedure

Level of Significance            0.05
Square Root of Critical Value    2.795483483

Sample Proportions
Group 1 (18-29)    0.96
Group 2 (30-49)    0.95
Group 3 (50-64)    0.83
Group 4 (65+)      0.61

MARASCUILO TABLE
Proportions              Absolute Differences    Critical Range
| Group 1 - Group 2 |    0.01                    0.057934667    Not significant
| Group 1 - Group 3 |    0.13                    0.083747945    Significant
| Group 1 - Group 4 |    0.35                    0.103904026    Significant
| Group 2 - Group 3 |    0.12                    0.08584456     Significant
| Group 2 - Group 4 |    0.34                    0.105601216    Significant
| Group 3 - Group 4 |    0.22                    0.121691862    Significant

The 18- to 29-year-olds differ significantly from both the 50- to 64-year-olds and those 65 and older. The 30- to 49-year-olds also differ significantly from both the 50- to 64-year-olds and those 65 and older. There is a significant difference between those who are between 50 and 64 years old and those 65 or older.
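Each critical range in the Marascuilo table is the square root of the chi-square critical value multiplied by the standard error of the difference between two sample proportions. The sketch below illustrates that calculation under stated assumptions: the function name and return layout are made up for this example, and the critical value 7.814728 is copied from the PHStat output in (a) rather than computed.

```python
import math
from itertools import combinations

def marascuilo(proportions, sizes, chi2_critical):
    """Return (group_i, group_j, |p_i - p_j|, critical range, significant?) per pair."""
    root = math.sqrt(chi2_critical)
    results = []
    for i, j in combinations(range(len(proportions)), 2):
        p_i, p_j = proportions[i], proportions[j]
        abs_diff = abs(p_i - p_j)
        critical_range = root * math.sqrt(
            p_i * (1 - p_i) / sizes[i] + p_j * (1 - p_j) / sizes[j]
        )
        results.append((i + 1, j + 1, abs_diff, critical_range, abs_diff > critical_range))
    return results

# Problem 12.14: sample proportions for the four age groups, n = 200 each
table = marascuilo([0.96, 0.95, 0.83, 0.61], [200, 200, 200, 200], 7.814728)
for g1, g2, diff, crit, significant in table:
    label = "Significant" if significant else "Not significant"
    print(f"| Group {g1} - Group {g2} |  {diff:.2f}  {crit:.9f}  {label}")
```

A pair is flagged as significant only when its absolute difference exceeds its own critical range, so the ranges differ from pair to pair even when the sample sizes are equal.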

12.15

From PHStat Chi-Square Test

Observed Frequencies
                                Education
Owns Smartphone    HS Grad    Some College    College Grad    Total
Yes                    150             356             372      878
No                      50              44              28      122
Total                  200             400             400     1000

Expected Frequencies
                                Education
Owns Smartphone    HS Grad    Some College    College Grad    Total
Yes                  175.6           351.2           351.2      878
No                    24.4            48.8            48.8      122
Total                  200             400             400     1000

Data
Level of Significance        0.05
Number of Rows               2
Number of Columns            3
Degrees of Freedom           2

Results
Critical Value               5.991465
Chi-Square Test Statistic    41.22633
p-Value                      1.12E-09
Reject the null hypothesis

(a)

At the 0.05 significance level, because the calculated test statistic 41.22633 is greater than the critical value of 5.991, you reject H0 and conclude that there is evidence of a difference among the education groups in the proportion of U.S. adult smartphone owners.

(b)

p-value = 0.0000. The probability of obtaining a data set that gives rise to a test statistic of 41.22633 or more is 0.0000 if there is no difference in the proportion of U.S. adult smartphone owners.

12.15 cont.

(c) Marascuilo Procedure

Level of Significance            0.05
Square Root of Critical Value    2.447746831

Sample Proportions
Group 1 (HS Grad)         0.75
Group 2 (Some College)    0.89
Group 3 (College Grad)    0.93

MARASCUILO TABLE
Proportions              Absolute Differences    Critical Range
| Group 1 - Group 2 |    0.14                    0.08416299     Significant
| Group 1 - Group 3 |    0.18                    0.081191803    Significant
| Group 2 - Group 3 |    0.04                    0.049411758    Not significant

High school graduates differ significantly from both those with some college education and college graduates. There is no significant difference between those with some college education and college graduates.

12.16

(a)

H0: $\pi_1 = \pi_2 = \pi_3$
H1: At least one proportion differs.

Chi-Square Test

Observed Frequencies
                            Residence
Smartphone Owner    Rural    Suburb    Urban    Total
Yes                   160       336      356      852
No                     40        64       44      148
Total                 200       400      400     1000

Expected Frequencies
                            Residence
Smartphone Owner    Rural    Suburb    Urban    Total
Yes                 170.4     340.8    340.8      852
No                   29.6      59.2     59.2      148
Total                 200       400      400     1000

Data
Level of Significance        0.05
Number of Rows               2
Number of Columns            3
Degrees of Freedom           2

Results
Critical Value               5.991465
Chi-Square Test Statistic    9.326228
p-Value                      0.009437
Reject the null hypothesis

Because 9.326228 > 5.9915, reject H0. There is a significant difference among areas of residence with respect to the proportion who own smartphones.

(b)

p-value = 0.0094. The probability of a test statistic greater than 9.3262 is 0.0094.

(c)

Level of Significance            0.05
Square Root of Critical Value    2.447746831

Sample Proportions
Group 1 (Rural)     0.8
Group 2 (Suburb)    0.84
Group 3 (Urban)     0.89

Marascuilo Table
Proportions              Absolute Differences    Critical Range
| Group 1 – Group 2 |    0.04                    0.082500326    Not significant
| Group 1 – Group 3 |    0.09                    0.079117524    Significant
| Group 2 – Group 3 |    0.05                    0.058987652    Not significant

Ownership of a smartphone is different between those who live in rural areas and those who live in urban areas.

12.17

H0: $\pi_1 = \pi_2 = \pi_3$
H1: At least one proportion differs.

Chi-Square Test

Observed Frequencies
                            Residence
Smartphone Owner       Rural      Suburb       Urban    Total
Yes                       80          84          89      253
No                        20          16          11       47
Total                    100         100         100      300

Expected Frequencies
                            Residence
Smartphone Owner       Rural      Suburb       Urban    Total
Yes                 84.33333    84.33333    84.33333      253
No                  15.66667    15.66667    15.66667       47
Total                    100         100         100      300

Data
Level of Significance        0.05
Number of Rows               2
Number of Columns            3
Degrees of Freedom           2

Results
Critical Value               5.991465
Chi-Square Test Statistic    3.077958
p-Value                      0.2146
Do not reject the null hypothesis

(a)

Because 3.077958 < 5.9915, do not reject H0. There is not enough evidence of a significant difference among areas of residence with respect to the proportion who own smartphones. p-value = 0.2146. The probability of a test statistic greater than 3.077958 is 0.2146.

(b)

2 2 The STAT is sensitive to sample size. The STAT with the same proportions (80%, 84%, 2 89%) that are smartphone owners were associated with a much larger STAT 2 (9.3262 ) when n = 200, 400, or 400 for each group. The STAT was 3.077958 when n = 100 with the same proportions (80%, 84%, 89%). The chi-square test for differences among more than two populations requires that each table cell have a sufficient expected frequency. Each cell should have an expected frequency of at least 1.

12.18

(a)

From PHStat Chi-Square Test

Observed Frequencies
                          Activities
Used Device    Smartphone    Computer    Tablet    Total
Yes                   225         156       174      555
No                     75         144       126      345
Total                 300         300       300      900

Expected Frequencies
                          Activities
Used Device    Smartphone    Computer    Tablet    Total
Yes                   185         185       185      555
No                    115         115       115      345
Total                 300         300       300      900

Data
Level of Significance        0.05
Number of Rows               2
Number of Columns            3
Degrees of Freedom           2

Results
Critical Value               5.991464547
Chi-Square Test Statistic    36.12690952
p-Value                      1.42936E-08
Reject the null hypothesis

Because $\chi^2_{STAT}$ = 36.1269 > 5.9915, reject H0. There is evidence of a difference in the percentage who use their device to check social media while watching TV between the groups.

(b)

p-value = 0.0000.

12.18 cont.

(b) Marascuilo Procedure

Level of Significance            0.05
Square Root of Critical Value    2.447746831

Sample Proportions
Group 1 (Smartphone)    0.75
Group 2 (Computer)      0.52
Group 3 (Tablet)        0.58

MARASCUILO TABLE
Proportions              Absolute Differences    Critical Range
| Group 1 - Group 2 |    0.23                    0.093432135    Significant
| Group 1 - Group 3 |    0.17                    0.092788655    Significant
| Group 2 - Group 3 |    0.06                    0.099247004    Not significant

(c)

Smartphone versus computer: 0.23 > 0.0934. Significant.
Smartphone versus tablet: 0.17 > 0.0928. Significant.
Computer versus tablet: 0.06 < 0.0992. Not significant.
The smartphone group is different from the computer and tablet groups.

12.19

(a)

From PHStat

Chi-Square Test

Observed Frequencies
                                          Country
At least 30% women    Australia     Germany     Ireland       Spain       Swiss    Total
Yes                          48          47          11          16          21      143
No                           14          10           9           3          22       58
Total                        62          57          20          19          43      201

Expected Frequencies
                                          Country
At least 30% women    Australia     Germany     Ireland       Spain       Swiss    Total
Yes                    44.10945    40.55224    14.22886    13.51741    30.59204      143
No                     17.89055    16.44776    5.771144    5.482587    12.40796       58
Total                        62          57          20          19          43      201

Data
Level of Significance        0.05
Number of Rows               2
Number of Columns            5
Degrees of Freedom           4

Results
Critical Value               9.487729
Chi-Square Test Statistic    19.28403
p-Value                      0.000691
Reject the null hypothesis

At the 0.05 significance level, there is evidence that there are differences among countries in the proportion of companies that have at least 30% women directors on their boards. Because $\chi^2_{STAT}$ = 19.28403 > 9.487729 (p-value = 0.0007), reject H0.

(b)

The p-value of 0.0007 is well below the 0.05 significance level, so one rejects the H0 that there are no differences among countries in the proportion of companies that have at least 30% women directors on their boards. The probability that a $\chi^2_{STAT}$ of 19.28403 or larger would be observed if H0 is true is 0.0007.

12.19 cont.

(c) Marascuilo Procedure

Level of Significance            0.05
Square Root of Critical Value    3.080215745

Sample Proportions
Group 1 (Australia)    0.774193548
Group 2 (Germany)      0.824561404
Group 3 (Ireland)      0.55
Group 4 (Spain)        0.842105263
Group 5 (Swiss)        0.488372093

MARASCUILO TABLE
Proportions              Absolute Differences    Critical Range
| Group 1 - Group 2 |    0.050367855             0.225456989    Not significant
| Group 1 - Group 3 |    0.224193548             0.379687583    Not significant
| Group 1 - Group 4 |    0.067911715             0.305201793    Not significant
| Group 1 - Group 5 |    0.285821455             0.286152749    Not significant
| Group 2 - Group 3 |    0.274561404             0.376150883    Not significant
| Group 2 - Group 4 |    0.01754386              0.30079056     Not significant
| Group 2 - Group 5 |    0.33618931              0.281443107    Significant
| Group 3 - Group 4 |    0.292105263             0.428726915    Not significant
| Group 3 - Group 5 |    0.061627907             0.415381787    Not significant
| Group 4 - Group 5 |    0.35373317              0.348607951    Significant

Because $\chi^2_{STAT}$ was significant at the 0.05 level, it is appropriate to perform the Marascuilo procedure. At the 0.05 level of significance, the Marascuilo procedure revealed that a significantly higher proportion of German companies have at least 30% women directors compared to Swiss companies. The procedure also revealed that a significantly higher proportion of Spanish companies have at least 30% women board directors compared to Swiss companies.

12.20

df = (r – 1)(c – 1) = (3 – 1)(4 – 1) = 6

12.21

(a) df = 9, $\chi^2 = 16.919$

(b) df = 9, $\chi^2 = 21.666$

(c) df = 15, $\chi^2 = 30.578$

(d) df = 10, $\chi^2 = 18.307$

(e) df = 10, $\chi^2 = 23.209$

12.22

H0: There is no relationship between type of dessert and type of entrée.
H1: There is a relationship between type of dessert and type of entrée.

Test statistic: $\chi^2_{STAT} = \sum_{\text{all cells}} \frac{(f_o - f_e)^2}{f_e} = 92.1028$

Decision: Since the calculated test statistic 92.1028 is larger than the critical value of 16.9190, you reject H0 and conclude that there is enough evidence of a relationship between type of dessert and type of entrée.

12.23

From PHStat Chi-Square Test

Observed Frequencies
                                      Generation
Employment Status        Gen Z    Millennials       Gen X     Boomers    Total
Working Full-time          210            280         291         290     1071
Working Part-time          170             98          85          97      450
Freelancer                 110            129         136          69      444
Project-based               11             15          10           5       41
Total                      501            522         522         461     2006

Expected Frequencies
                                      Generation
Employment Status        Gen Z    Millennials       Gen X     Boomers    Total
Working Full-time     267.4831       278.6949    278.6949    246.1271     1071
Working Part-time     112.3878       117.0987    117.0987    103.4148      450
Freelancer            110.8893       115.5374    115.5374    102.0359      444
Project-based         10.23978       10.66899    10.66899    9.422233       41
Total                      501            522         522         461     2006

Data
Level of Significance        0.05
Number of Rows               4
Number of Columns            4
Degrees of Freedom           9

Results
Critical Value               16.91898
Chi-Square Test Statistic    82.39589
p-Value                      5.39E-14
Reject the null hypothesis

At the 0.05 level of significance, there is sufficient evidence of a significant relationship between generation and employment status. Because $\chi^2_{STAT}$ = 82.39589 > 16.919 (p-value = 0.0000), reject H0.

12.24

From PHStat Chi-Square Test

Observed Frequencies
                                        Age Group
Support Quiet Quitting       18-29       30-44       45-64         65+    Total
Strongly Support                53          36          58          28      175
Somewhat support                57          56         107          73      293
Somewhat oppose                 29          30          64          45      168
Strongly Oppose                 27          12          39          26      104
Not sure                        56          63          89          41      249
Total                          222         197         357         213      989

Expected Frequencies
                                        Age Group
Support Quiet Quitting       18-29       30-44       45-64         65+    Total
Strongly Support           39.2821    34.85844    63.16987    37.68959      175
Somewhat support          65.76946    58.36299    105.7644    63.10313      293
Somewhat oppose           37.71082    33.46411    60.64307      36.182      168
Strongly Oppose           23.34479    20.71587    37.54095    22.39838      104
Not sure                  55.89282    49.59858     89.8817     53.6269      249
Total                          222         197         357         213      989

Data
Level of Significance        0.01
Number of Rows               5
Number of Columns            4
Degrees of Freedom           12

Results
Critical Value               26.21697
Chi-Square Test Statistic    26.75745
p-Value                      0.008373
Reject the null hypothesis

Decision: Because $\chi^2_{STAT}$ = 26.757 > 26.217 (p-value = 0.008373 < 0.01), reject H0. There is evidence to conclude that there is a relationship between age and supporting quiet quitting.

12.25

From PHStat Chi-Square Test

Observed Frequencies
                                  Geographic Region
Cloud Deployment       Americas    Asia-Pacific        EMEA    Total
Hybrid Cloud                595             294         190     1079
Private Cloud               646             154         185      985
Public Cloud                306             196          85      587
No Cloud Deployment         153              56          40      249
Total                      1700             700         500     2900

Expected Frequencies
                                  Geographic Region
Cloud Deployment       Americas    Asia-Pacific        EMEA    Total
Hybrid Cloud           632.5172     260.4482759    186.0345     1079
Private Cloud          577.4138     237.7586207    169.8276      985
Public Cloud           344.1034     141.6896552    101.2069      587
No Cloud Deployment    145.9655     60.10344828    42.93103      249
Total                      1700             700         500     2900

Data
Level of Significance        0.05
Number of Rows               4
Number of Columns            3
Degrees of Freedom           6

Results
Critical Value               12.59159
Chi-Square Test Statistic    74.09251
p-Value                      5.9E-14
Reject the null hypothesis

At the 0.05 level of significance, there is evidence of a significant relationship between geographic region and cloud deployment. Because $\chi^2_{STAT}$ = 74.093 > 12.592 (p-value = 0.000), reject H0. The probability of observing a $\chi^2_{STAT}$ of 74.093 or larger, when H0 is true, is close to 0.

12.26

From PHStat Chi-Square Test

Observed Frequencies
                                        Geographic Region
Number of Tech M&A    North America    Europe    Asia-Pacific    Latin America    Total
1                                14        20              15               10       59
2-3                              44        44              50               17      155
4-6                              24        21              22                3       70
7 or more                         8         5               3                0       16
Total                            90        90              90               30      300

Expected Frequencies
                                        Geographic Region
Number of Tech M&A    North America    Europe    Asia-Pacific    Latin America    Total
1                              17.7      17.7            17.7              5.9       59
2-3                            46.5      46.5            46.5             15.5      155
4-6                              21        21              21                7       70
7 or more                       4.8       4.8             4.8              1.6       16
Total                            90        90              90               30      300

Data
Level of Significance        0.05
Number of Rows               4
Number of Columns            4
Degrees of Freedom           9

Results
Critical Value               16.9189776
Chi-Square Test Statistic    12.18932412
p-Value                      0.202845358
Do not reject the null hypothesis

Because $\chi^2_{STAT}$ = 12.189 < 16.919, do not reject H0. There is no evidence of a relationship between number of Tech M&A and geographic region.

12.27

(a)

The lower and upper critical values are 29 and 55, respectively.

(b)

The lower and upper critical values are 27 and 57, respectively.

(c)

The lower and upper critical values are 25 and 59, respectively.

(d)

As the level of significance α gets smaller, the width of the nonrejection region gets wider.

12.28

(a)

The lower critical value is 31.

(b)

The lower critical value is 29.

(c)

The lower critical value is 27.

(d)

The lower critical value is 25.

12.29

T1 = 4 + 1 + 8 + 2 + 5 + 16 + 11 = 47

12.30

The lower and upper critical values are 40 and 79, respectively.

12.31

Decision: Since T1 = 47 is between the critical bounds of 40 and 79, do not reject H0.

12.32

(a)

The ranks for Sample 1 are 1, 2, 4, 5, and 10, respectively. The ranks for Sample 2 are 3, 6.5, 6.5, 8, 9, and 11, respectively.

(b)

T1 = 1 + 2 + 4 + 5 + 10 = 22

(c)

T2 = 3 + 6.5 + 6.5 + 8 + 9 + 11 = 44

(d)

$T_1 + T_2 = \frac{n(n + 1)}{2} = \frac{11(12)}{2} = 66$

$T_1 + T_2 = 22 + 44 = 66$
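The ranking step behind Problem 12.32 can be sketched in a few lines. The raw sample values are not reproduced in this solution, so the two lists below are hypothetical values chosen only because they produce the ranks stated in (a); the function name is likewise illustrative. Tied values share the average of the ranks they occupy.

```python
def average_ranks(values):
    """Rank values from smallest (rank 1) upward, giving ties their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # extend the block while the next sorted value is tied with the current one
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared_rank = ((start + 1) + (end + 1)) / 2    # mean of the 1-based positions
        for k in range(start, end + 1):
            ranks[order[k]] = shared_rank
        start = end + 1
    return ranks

# Hypothetical data (not from the textbook), chosen to reproduce the ranks in (a)
sample1 = [2, 3, 7, 8, 20]
sample2 = [5, 9, 9, 12, 15, 25]

ranks = average_ranks(sample1 + sample2)
t1 = sum(ranks[:len(sample1)])     # sum of ranks for Sample 1
t2 = sum(ranks[len(sample1):])     # sum of ranks for Sample 2
n = len(ranks)
print(t1, t2, n * (n + 1) / 2)
```

The final check, T1 + T2 = n(n + 1)/2, is a quick way to catch ranking mistakes before looking up critical values.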

12.33

The lower critical value is 20.

12.34

Decision: Since T1 = 22 is greater than the lower critical bound of 20, do not reject H0.

12.35

H0: M1 = M2 (There is no difference in performance between the traditional and the experimental training methods.)
H1: M1 ≠ M2 (There is a difference in performance between the traditional and the experimental training methods.)
where Populations: 1 = traditional, 2 = experimental

Decision rule: If T1 < 78 or T1 > 132, reject H0.
Test statistic: T1 = 1 + 2 + 3 + 5 + 9 + 10 + 12 + 13 + 14 + 15 = 84
Decision: Since T1 = 84 is between the critical bounds of 78 and 132, do not reject H0. There is not enough evidence to conclude that there is a difference in median performance between the traditional and the experimental training methods.

12.36

(a)

The data are ordinal.

(b)

The two-sample t test is inappropriate because the data are ordinal, the sample size is small, and the data are not normally distributed.

12.36 cont.

(c)

H0: M1 = M2
H1: M1 ≠ M2
where Populations: 1 = California, 2 = Washington

Data
Level of Significance        0.05

Population 1 Sample
Sample Size                  8
Sum of Ranks                 47

Population 2 Sample
Sample Size                  8
Sum of Ranks                 89

Intermediate Calculations
Total Sample Size n          16
T1 Test Statistic            47
T1 Mean                      68
Standard Error of T1         9.521905
Z Test Statistic             –2.20544

Two-Tailed Test
Lower Critical Value         –1.95996
Upper Critical Value         1.959964
p-value                      0.027423
Reject the null hypothesis

$\mu_{T_1} = \frac{n_1(n + 1)}{2} = \frac{8(16 + 1)}{2} = 68$

$\sigma_{T_1} = \sqrt{\frac{n_1 n_2 (n + 1)}{12}} = \sqrt{\frac{8 \cdot 8 (16 + 1)}{12}} = 9.5219$

$Z_{STAT} = \frac{T_1 - \mu_{T_1}}{\sigma_{T_1}} = \frac{47 - 68}{9.5219} = -2.2054$

Decision: Since ZSTAT = –2.2054 is lower than the lower critical bound of –1.96, reject H0. There is enough evidence of a significant difference in the median rating of California Cabernets and Washington Cabernets.

12.37

From PHStat Wilcoxon Rank Sum Test

Data
Level of Significance        0.05

Population 1 Sample
Sample Size                  12
Sum of Ranks                 131

Population 2 Sample
Sample Size                  7
Sum of Ranks                 59

Intermediate Calculations
Total Sample Size n          19
T1 Test Statistic            59
T1 Mean                      70
Standard Error of T1         11.8322
Z Test Statistic             -0.92967

Two-Tail Test
Lower Critical Value         -1.9600
Upper Critical Value         1.9600
p-Value                      0.3525
Do not reject the null hypothesis

(a)

Using α = 0.05, there is insufficient evidence of a difference in the median satisfaction rating for traditional and prepaid providers. Because one of the sample sizes is greater than 10, it is appropriate to use the standardized Z test statistic for the Wilcoxon rank sum test. Because –1.96 < ZSTAT = –0.9297 < 1.96 (p-value = 0.3525 > 0.05), do not reject H0.

(b)

The Wilcoxon rank sum test assumes the two samples are independent and that the dependent variable is at the ordinal level or higher. However, the Wilcoxon rank sum test does not require that the sample data meet the normality assumption. The Wilcoxon rank sum test is an alternative to the pooled-variance and separate-variance t tests when the assumptions of these procedures are violated.

(c)

The results from (a) were similar to the results observed in 10.9 (a), which were associated with a pooled-variance t test on the same cell phone data used in 12.37 (a). The tSTAT of –1.0468 or p-value = 0.3099 led to the conclusion not to reject H0. With the Wilcoxon rank sum test, the ZSTAT = –0.92967 or p-value = 0.3525, also led to the conclusion not to reject H0. The p-value was slightly higher for the Wilcoxon rank sum test compared to the p-value associated with the pooled-variance t test from 10.9 (a).

12.38

(a)

H0: M1 = M2
H1: M1 ≠ M2
where Populations: 1 = Wing A, 2 = Wing B

Population 1 sample: Sample size 20, sum of ranks 561
Population 2 sample: Sample size 20, sum of ranks 259

$\mu_{T_1} = \frac{n_1(n + 1)}{2} = \frac{20(40 + 1)}{2} = 410$

$\sigma_{T_1} = \sqrt{\frac{n_1 n_2 (n + 1)}{12}} = \sqrt{\frac{20(20)(40 + 1)}{12}} = 36.9685$

$Z_{STAT} = \frac{T_1 - \mu_{T_1}}{\sigma_{T_1}} = \frac{561 - 410}{36.9685} = 4.0846$

Decision: Because ZSTAT = 4.0846 > 1.96 (or p-value = 0.0000 < 0.05), reject H0. There is sufficient evidence of a difference in the median delivery time in the two wings of the hotel.
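The large-sample approximation used in Problems 12.36 through 12.42 needs only the rank sum T1 and the two sample sizes. The sketch below is one way to reproduce the intermediate calculations; the function name and return order are illustrative, and the two-tail p-value uses the standard normal distribution through `math.erfc`.

```python
import math

def wilcoxon_rank_sum_z(t1, n1, n2):
    """Large-sample Z approximation for the Wilcoxon rank sum test.

    t1 is the sum of ranks for the sample of size n1.
    Returns (mean of T1, standard error of T1, Z statistic, two-tail p-value).
    """
    n = n1 + n2
    mean_t1 = n1 * (n + 1) / 2
    se_t1 = math.sqrt(n1 * n2 * (n + 1) / 12)
    z = (t1 - mean_t1) / se_t1
    p_value = math.erfc(abs(z) / math.sqrt(2))    # area in both tails beyond |z|
    return mean_t1, se_t1, z, p_value

# Problem 12.38: T1 = 561 with n1 = n2 = 20
mean_t1, se_t1, z, p_value = wilcoxon_rank_sum_z(561, 20, 20)
print(mean_t1, round(se_t1, 4), round(z, 4))

# Problem 12.36: T1 = 47 with n1 = n2 = 8
_, _, z_36, p_36 = wilcoxon_rank_sum_z(47, 8, 8)
print(round(z_36, 5), round(p_36, 6))
```

Reject H0 at the 0.05 level when the returned Z falls outside ±1.96 or, equivalently, when the two-tail p-value is below 0.05.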

(b)

The results of (a) are consistent with the results of Problem 10.65.

12.39

(a)

Using α = 0.05, there is evidence of a difference in the median life of bulbs between the two manufacturers. Because both sample sizes are greater than 10, it is appropriate to use the standardized Z test statistic for the Wilcoxon rank sum test. Because ZSTAT = –4.4215 or p-value = 0.0000, reject H0.

(b)

The Wilcoxon rank sum test assumes the two samples are independent and that the dependent variable is at the ordinal level or higher. However, the Wilcoxon rank sum test does not require that the sample data meet the normality assumption. The Wilcoxon rank sum test is an alternative to the pooled-variance and separate-variance t tests when one or more of the assumptions of these procedures are violated.

(c)

A pooled-variance t test performed on the same data analyzed in 12.39 (a) produced the results for Problem 10.64. Because tSTAT = –5.08 or p-value = 0.000, one would reject the null hypothesis that the mean length of bulb life is the same for the two manufacturers. The Wilcoxon rank sum test produced similar results with ZSTAT = –4.4215 or p-value = 0.0000. The conclusion to reject H0 was the same for both the pooled-variance t test and the Wilcoxon rank sum test.

12.40

(a)

From PHStat Wilcoxon Rank Sum Test

Data
Level of Significance        0.05

Population 1 Sample
Sample Size                  21
Sum of Ranks                 464.5

Population 2 Sample
Sample Size                  15
Sum of Ranks                 201.5

Intermediate Calculations
Total Sample Size n          36
T1 Test Statistic            201.5
T1 Mean                      277.5
Standard Error of T1         31.1649
Z Test Statistic             -2.438642

Two-Tail Test
Lower Critical Value         -1.9600
Upper Critical Value         1.9600
p-Value                      0.0147
Reject the null hypothesis

Because ZSTAT = –2.4386 < –1.96, reject H0. There is evidence to conclude that there is a difference in the median brand value between the two sectors.

(b)

You must assume approximately equal variability in the two populations.

(c)

Using the pooled-variance t test and the separate-variance t test, the decision was to not reject the null hypothesis. Because the data are very right-skewed and the variances are different, the Wilcoxon rank sum test is most appropriate.

12.41

(a)

H0: M1 = M2
H1: M1 ≠ M2
where Populations: 1 = Bank 1, 2 = Bank 2

Wilcoxon Rank Sum Test

Data
Level of Significance        0.05

Population 1 Sample
Sample Size                  15
Sum of Ranks                 153

Population 2 Sample
Sample Size                  15
Sum of Ranks                 312

Intermediate Calculations
Total Sample Size n          30
T1 Test Statistic            153
T1 Mean                      232.5
Standard Error of T1         24.10913
Z Test Statistic             –3.29751

Two-Tailed Test
Lower Critical Value         –1.95996
Upper Critical Value         1.959961
p-value                      0.000976
Reject the null hypothesis

Decision rule: If ZSTAT < –1.96 or ZSTAT > 1.96, reject H0.

Decision: Since ZSTAT = –3.2975 is less than the lower critical bound of –1.96, reject H0. There is enough evidence to conclude that the median waiting time between the two branches is different.

(b)

You must assume approximately equal variability in the two populations.

(c)

In Problem 10.12, both the pooled-variance t test and the separate-variance t test allowed you to reject the null hypothesis and conclude that the mean waiting time between the two branches is different. Here, the Wilcoxon rank sum test with the large-sample Z approximation also allowed you to reject the null hypothesis and conclude that the median waiting time between the two branches is different.

12.42

(a)

From PHStat, use the Wilcoxon Rank Sum Test.

Level of Significance        0.05

Population 1 Sample
Sample Size                  28
Sum of Ranks                 891

Population 2 Sample
Sample Size                  29
Sum of Ranks                 762

Intermediate Calculations
Total Sample Size n          57
T1 Test Statistic            891
T1 Mean                      812
Standard Error of T1         62.6472
Z Test Statistic             1.2610308

Two-Tail Test
Lower Critical Value         -1.9600
Upper Critical Value         1.9600
p-Value                      0.2073
Do not reject the null hypothesis

Because –1.96 < ZSTAT = 1.261 < 1.96 (or the p-value = 0.2073 > 0.05), do not reject H0. There is not enough evidence to conclude that there is a difference in the median rating of ads that play before and after halftime.

(b)

You must assume approximately equal variability in the two populations.

(c)

Using the pooled-variance t test, you do not reject the null hypothesis (–2.004 < tSTAT = –1.3627 < 2.004; p-value = 0.1785 > 0.05) and conclude that there is insufficient evidence of a difference in the mean rating of ads before and after halftime in Problem 10.11 (a).

12.43

For the 0.01 level of significance and 5 degrees of freedom, $\chi^2 = 15.086$.

12.44

(a)

Decision rule: If H > $\chi^2 = 15.086$, reject H0.

(b)

Decision: Since Hcalc = 13.77 is less than the critical bound of 15.086, do not reject H0.

12.45

Kruskal-Wallis Rank Test for Differences in Medians

Data Level of Significance

0.05

Intermediate Calculations Sum of Squared Ranks/Sample Size

39586.65

Sum of Sample Sizes

50

Number of Groups

5

Test Result H Test Statistic

33.29012

Critical Value

9.487729

p-Value

1.04E-06

Reject the null hypothesis

(a)

H0: M1 = M2 = M3 = M4 = M5
H1: At least one of the medians differs.
Since the p-value is virtually 0, reject H0. There is enough evidence of a significant difference in the median amount of food eaten among the various products.

(b)

In (a), you conclude that there is enough evidence of a significant difference in the median amount of food eaten among the various products, while in problem 11.13(a) you also conclude that there is evidence of a significant difference in the mean amount of food eaten among the various products.

(c)


The normal probability plot suggests that the kidney-based data appear to deviate from the normal distribution. Hence, the Kruskal-Wallis rank test is more appropriate.

12.46

Kruskal-Wallis rank test:

Data
Level of Significance                    0.05

Group      Sum of Ranks    Sample Size
Group 1    640             15
Group 2    291             15
Group 3    468             15
Group 4    431             15

Intermediate Calculations
Sum of Squared Ranks/Sample Size         59937.73
Sum of Sample Sizes                      60
Number of Groups                         4

Test Result
H Test Statistic                         13.51716
Critical Value                           7.814728
p-Value                                  0.003642
Reject the null hypothesis

(a)

H0: Mmain = MSat1 = MSat2 = MSat3 H1: At least one of the medians differs. Since the p-value = 0.0036 is lower than 0.05, reject H0. There is sufficient evidence of a difference in the median waiting time in the four locations.

(b)

The results are consistent with those of Problem 11.9.
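As a check on the 12.46 output, the H statistic can be recomputed from the group rank sums and sample sizes. This is a minimal Python sketch with a hypothetical function name, `kruskal_wallis_h`; it assumes no correction for ties and takes the critical value from the output rather than computing it.

```python
def kruskal_wallis_h(rank_sums, sample_sizes):
    """Kruskal-Wallis H statistic from per-group rank sums (no tie correction).

    Returns (sum of squared rank sums / sample size, H).
    """
    n = sum(sample_sizes)
    sum_sq = sum(t * t / nj for t, nj in zip(rank_sums, sample_sizes))
    h = 12 / (n * (n + 1)) * sum_sq - 3 * (n + 1)
    return sum_sq, h

# Rank sums and sample sizes from the 12.46 output
sum_sq, h = kruskal_wallis_h([640, 291, 468, 431], [15, 15, 15, 15])

critical_value = 7.814728  # chi-square critical value reported for 3 d.f. at 0.05
reject_h0 = h > critical_value
print(round(sum_sq, 2), round(h, 5), reject_h0)
```

The same function applies to the other Kruskal-Wallis problems in this section whenever the output lists the rank sum and sample size for each group.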

12.47

From PHStat: Kruskal-Wallis Rank Test for Differences in Medians

Data
Level of Significance                    0.05

Group       Sample Size    Sum of Ranks    Mean Ranks
Burger      14             473             33.7857143
Snack       6              66              11
Chicken     9              295.5           32.8333333
Global      6              181.5           30.25
Sandwich    9              194             21.5555556
Pizza       6              65              10.8333333

Intermediate Calculations
Sum of Squared Ranks/Sample Size         36785.21
Sum of Sample Sizes                      50
Number of Groups                         6

Test Result
H Test Statistic                         20.1069
Critical Value                           11.0705
p-Value                                  0.0012
Reject the null hypothesis

(a)

Because H = 20.1069 > 11.0705 or p-value = 0.0012 < 0.05, reject H0. At the 0.05 significance level, there is evidence of a difference in median U.S. average sales per unit among the food segments.

(b)

The results from Problem 11.11 (a) were produced by a one-way ANOVA F test. At the 0.05 level of significance, there was insufficient evidence of differences in U.S. mean sales per unit among the food segments: because FSTAT = 1.3691 (p-value = 0.2543 > 0.05), the conclusion was to not reject H0. In contrast, the Kruskal-Wallis rank test reveals a significant difference among the food segments.

12.48

Kruskal-Wallis rank test:

Data
Level of Significance                    0.05

Intermediate Calculations
Sum of Squared Ranks/Sample Size         8705.333
Sum of Sample Sizes                      30
Number of Groups                         5

Test Result
H Test Statistic                         19.32688
Critical Value                           9.487729
p-Value                                  0.000678
Reject the null hypothesis

(a)

H0: MA = MB = MC = MD = ME H1: At least one of the medians differs. Since the p-value = 0.0007 < 0.05, reject H0. There is sufficient evidence of a difference in the median rating of the five advertisements.

(b)

In (a), you conclude that there is evidence of a difference in the median rating of the five advertisements, while in problem 11.10 (a), you conclude that there is evidence of a difference in the mean rating of the five advertisements.

(c)

Since the combined scores are not true continuous variables, the nonparametric Kruskal-Wallis rank test is more appropriate because it does not require the scores to be normally distributed.

12.49

From PHStat: Kruskal-Wallis Rank Test for Differences in Medians

Data
Level of Significance                    0.05

Group            Sample Size    Sum of Ranks    Mean Ranks
Asia             10             223             22.3
Europe           10             318             31.8
North America    10             114.5           11.45
South America    10             164.5           16.45

Intermediate Calculations
Sum of Squared Ranks/Sample Size         19102.35
Sum of Sample Sizes                      40
Number of Groups                         4

Test Result
H Test Statistic                         16.7733
Critical Value                           7.8147
p-Value                                  0.0008
Reject the null hypothesis

(a)

Because H = 16.7733 > 7.815 or p-value = 0.0008 < 0.05, reject H0. At the 0.05 significance level, there is evidence of a difference in median congestion levels across the four continents.

(b)

The results from Problem 11.14 (a) were produced by a one-way ANOVA F test. At the 0.05 level of significance, there was evidence of differences in congestion levels among the four continents. Because FSTAT = 8.172 and the p-value = 0.003, the conclusion was to reject H0. Similarly, the Kruskal-Wallis rank test reveals a significant difference among the continents. Because H = 16.7733 > 7.815 or p-value = 0.0008, the conclusion is to reject H0.

12.50

From PHStat: Kruskal-Wallis Rank Test for Differences in Medians

Data
Level of Significance                    0.05

Group       Sample Size    Sum of Ranks    Mean Ranks
Europe      8              114             14.25
Americas    8              172.5           21.5625
Asia        8              109             13.625
Africa      8              132.5           16.5625

Intermediate Calculations
Sum of Squared Ranks/Sample Size         9023.688
Sum of Sample Sizes                      32
Number of Groups                         4

Test Result
H Test Statistic                         3.5419
Critical Value                           7.8147
p-Value                                  0.3154
Do not reject the null hypothesis

(a)

Because H = 3.5419 < 7.815 or the p-value = 0.3154 > 0.05, do not reject H0. There is insufficient evidence of a difference in the median export price across the global regions.

(b)

The results are the same. The ANOVA FSTAT also indicates that there is no difference in export costs among the four regions.

12.51

The Chi-square test for the difference between two proportions can be used as long as all expected frequencies are at least 5.

12.52

The Chi-square test can be used for c populations as long as all expected frequencies are at least one.

12.53

The Chi-square test for independence can be used as long as all expected frequencies are at least one.

12.54

The Wilcoxon rank sum test should be used when you are unable to assume that each of two independent populations is normally distributed.

12.55

The Kruskal-Wallis rank test should be used if you cannot assume that the populations are normally distributed.

12.56

(a)

H0: There is no relationship between a student's gender and his/her pizzeria selection.
H1: There is a relationship between a student's gender and his/her pizzeria selection.
Decision rule: d.f. = 1. If χ²STAT > 3.841, reject H0.
Test statistic: χ²STAT = 0.412
Decision: Since χ²STAT = 0.412 is smaller than the critical bound of 3.841, do not reject H0. There is not enough evidence to conclude that there is a relationship between a student's gender and his/her pizzeria selection.

(b)

Test statistic: χ²STAT = 2.624
Decision: Since χ²STAT = 2.624 is less than the critical bound of 3.841, do not reject H0. There is not enough evidence to conclude that there is a relationship between a student's gender and his/her pizzeria selection.

(c)

H0: There is no relationship between price and pizzeria selection.
H1: There is a relationship between price and pizzeria selection.
Decision rule: d.f. = 2. If χ²STAT > 5.991, reject H0.
Test statistic: χ²STAT = 4.956
Decision: Since χ²STAT = 4.956 is smaller than the critical bound of 5.991, do not reject H0. There is not enough evidence to conclude that there is a relationship between price and pizzeria selection.

(d)

p-value = 0.0839. The probability of obtaining a sample that gives a test statistic equal to or greater than 4.956 is 0.0839 if the null hypothesis of no relationship between price and pizzeria selection is true.

12.57

PHStat Results for Facebook: Chi-Square Test

Observed Frequencies
                 Column variable
Facebook         B2B         B2C         Total
Yes              1369        2742        4111
No               169         114         283
Total            1538        2856        4394

Expected Frequencies
                 Column variable
Facebook         B2B         B2C         Total
Yes              1438.944    2672.056    4111
No               99.05644    183.9436    283
Total            1538        2856        4394

Data
Level of Significance            0.05
Number of Rows                   2
Number of Columns                2
Degrees of Freedom               1

Results
Critical Value                   3.841459
Chi-Square Test Statistic        81.2133
p-Value                          2.03E-19
Reject the null hypothesis

At the 0.05 significance level, there is evidence of a difference between B2B and B2C marketers in the proportion who use Facebook. Because χ²STAT = 81.2133 > 3.841 or p-value = 0.000, reject H0. The results indicate that a greater proportion of B2C marketers (96%) utilize Facebook compared to B2B marketers (89%).
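The expected frequencies and test statistic in the Facebook output follow directly from the observed table. The Python sketch below is illustrative only (the function name `chi_square_test` is hypothetical, and this is not PHStat's implementation); the closed-form p-value used here is valid only for 1 degree of freedom, which is the case for a 2 x 2 table.

```python
import math

def chi_square_test(observed):
    """Chi-square statistic for a contingency table given as a list of rows.

    Expected frequency for each cell = row total * column total / n.
    Returns (expected frequencies, chi-square statistic).
    """
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    expected = [[r * c / n for c in col_totals] for r in row_totals]
    stat = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )
    return expected, stat

# Observed Facebook frequencies: rows are Yes/No, columns are B2B/B2C
observed = [[1369, 2742], [169, 114]]
expected, stat = chi_square_test(observed)

# For 1 d.f. only: P(chi-square > stat) = erfc(sqrt(stat / 2))
p_value = math.erfc(math.sqrt(stat / 2))
print(round(expected[0][0], 3), round(stat, 4), p_value)
```

The other 2 x 2 tables in this problem can be checked the same way by swapping in their observed frequencies.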

PHStat Results for Twitter: Chi-Square Test

Observed Frequencies
                 Column variable
Twitter          B2B         B2C         Total
Yes              831         1314        2145
No               707         1542        2249
Total            1538        2856        4394

Expected Frequencies
                 Column variable
Twitter          B2B         B2C         Total
Yes              750.7988    1394.201    2145
No               787.2012    1461.799    2249
Total            1538        2856        4394

Data
Level of Significance            0.05
Number of Rows                   2
Number of Columns                2
Degrees of Freedom               1

Results
Critical Value                   3.841459
Chi-Square Test Statistic        25.75197
p-Value                          3.88E-07
Reject the null hypothesis

At the 0.05 significance level, there is evidence of a difference between B2B and B2C marketers in the proportion who use Twitter. Because χ²STAT = 25.75197 > 3.841 or p-value = 0.0000, reject H0. The results indicate that a greater proportion of B2B marketers (54%) utilize Twitter compared to B2C marketers (46%).

PHStat Results for LinkedIn: Chi-Square Test

Observed Frequencies
                 Column variable
LinkedIn         B2B         B2C         Total
Yes              1246        1514        2760
No               292         1342        1634
Total            1538        2856        4394

Expected Frequencies
                 Column variable
LinkedIn         B2B         B2C         Total
Yes              966.0628    1793.937    2760
No               571.9372    1062.063    1634
Total            1538        2856        4394

Data
Level of Significance            0.05
Number of Rows                   2
Number of Columns                2
Degrees of Freedom               1

Results
Critical Value                   3.841459
Chi-Square Test Statistic        335.6029
p-Value                          5.79E-75
Reject the null hypothesis

At the 0.05 significance level, there is evidence of a difference between B2B and B2C marketers in the proportion who use LinkedIn. Because χ²STAT = 335.6029 > 3.841 or p-value = 0.0000, reject H0. The results indicate that a greater proportion of B2B marketers (81%) utilize LinkedIn compared to B2C marketers (53%).

PHStat Results for TikTok: Chi-Square Test

Observed Frequencies
                 Column variable
TikTok           B2B         B2C         Total
Yes              108         286         394
No               1430        2750        4180
Total            1538        3036        4574

Expected Frequencies
                 Column variable
TikTok           B2B         B2C         Total
Yes              132.4819    261.5181    394
No               1405.518    2774.482    4180
Total            1538        3036        4574

Data
Level of Significance            0.05
Number of Rows                   2
Number of Columns                2
Degrees of Freedom               1

Results
Critical Value                   3.841459
Chi-Square Test Statistic        7.458414
p-Value                          0.006314
Reject the null hypothesis

At the 0.05 significance level, there is evidence of a difference between B2B and B2C marketers in the proportion who use TikTok. Because χ²STAT = 7.458414 > 3.841 or p-value = 0.0063, reject H0. The results indicate that a greater proportion of B2C marketers (10%) utilize TikTok compared to B2B marketers (7%).

12.58

From PHStat Chi-Square Test

Observed Frequencies
                 Industry Sector
Leaders Level    Universal Bank    Insurance Company    Private Equity    Private Debt Firms    Total
Yes              33                12                   10                55                    110
No               50                71                   73                194                   388
Total            83                83                   83                249                   498

Expected Frequencies
                 Industry Sector
Leaders Level    Universal Bank    Insurance Company    Private Equity    Private Debt Firms    Total
Yes              18.33333          18.33333             18.33333          55                    110
No               64.66667          64.66667             64.66667          194                   388
Total            83                83                   83                249                   498

Data
Level of Significance            0.05
Number of Rows                   2
Number of Columns                4
Degrees of Freedom               3

Results
Critical Value                   7.814728
Chi-Square Test Statistic        22.72971
p-Value                          4.6E-05
Reject the null hypothesis

Because χ²STAT = 22.7297 > 7.815 (p-value = 0.000 < 0.05), reject H0. There is evidence of a difference among the proportions of financial sub-sector companies that operate at the highest level (Leaders Level) of the digital maturity scale.

12.59

From PHStat Chi-Square Test

Observed Frequencies
                    Geographic Region
Use Social Media    Central Europe    Western Europe    Nordic      Total
Yes                 62                22                52          136
No                  75                44                38          157
Total               137               66                90          293

Expected Frequencies
                    Geographic Region
Use Social Media    Central Europe    Western Europe    Nordic      Total
Yes                 63.59044          30.63481          41.77474    136
No                  73.40956          35.36519          48.22526    157
Total               137               66                90          293

Data
Level of Significance            0.05
Number of Rows                   2
Number of Columns                3
Degrees of Freedom               2

Results
Critical Value                   5.991465
Chi-Square Test Statistic        9.287276
p-Value                          0.009623
Reject the null hypothesis

(a)

At the 0.05 significance level, there is evidence of a difference in the proportion of customer service leaders who use social media in customer service among the three geographic regions. Because χ²STAT = 9.287276 > 5.991 or p-value = 0.0096, reject H0.

From PHStat Chi-Square Test

Observed Frequencies
                      Geographic Region
Offer Self-Service    Central Europe    Western Europe    Nordic      Total
Yes                   64                46                68          178
No                    73                20                22          115
Total                 137               66                90          293

Expected Frequencies
                      Geographic Region
Offer Self-Service    Central Europe    Western Europe    Nordic      Total
Yes                   83.22867          40.09556          54.67577    178
No                    53.77133          25.90444          35.32423    115
Total                 137               66                90          293

Data
Level of Significance            0.05
Number of Rows                   2
Number of Columns                3
Degrees of Freedom               2

Results
Critical Value                   5.991465
Chi-Square Test Statistic        21.80688
p-Value                          1.84E-05
Reject the null hypothesis

(b)

At the 0.05 significance level, there is evidence of a difference in the proportion of customer service leaders who offer self-service options for customers among the three geographic regions. Because χ²STAT = 21.80688 > 5.991 or p-value = 0.000, reject H0.

Chapter 13

13.1

(a) When X = 0, the estimated expected value of Y is 3.
(b) For each increase in the value of X by 1 unit, you can expect an increase of an estimated 6 units in the value of Y.
(c) Ŷ = 3 + 6X = 3 + 6(3) = 21
(d) Ŷ = 3 + 6X = 3 + 6(1) = 9

13.2

(a) yes
(b) no
(c) no
(d) yes

13.3

(a) When X = 0, the estimated expected value of Y is 16.
(b) For each increase in the value of X by 1 unit, you can expect a decrease of an estimated 0.5 units in the value of Y.
(c) Ŷ = 16 – 0.5X = 16 – 0.5(6) = 13

13.4

(a) The scatter plot shows a positive linear relationship.
(b) For each % increase in alcohol content, there is an expected increase in quality of an estimated 0.5624.
(c) Ŷ = –0.3529 + 0.5624X = –0.3529 + 0.5624(10) = 5.2715
(d) There appears to be a positive linear relationship between quality and % alcohol content. For each % increase in alcohol content, there is an expected increase in quality of an estimated 0.5624.

13.5

(a)

The scatterplot depicts a positive linear relationship between summary rating and cost.

(b)

b0 = –47.8091, b1 = 1.5951.

(c)

The Y intercept, b0, would be the mean cost per person when the summary rating is equal to zero. The literal interpretation is not meaningful in this case because a negative price and a mean summary rating of zero would not be possible. The sample slope, b1, indicates that for each unit of change in summary rating, the predicted mean cost per person increases by $1.60.

(d)

Ŷ = –47.8091 + 1.5951X = –47.8091 + 1.5951(55) = 39.92. The predicted mean cost per person for a restaurant with a summary rating of 55 would be $39.92.

(e)

One could inform the owners that there is a positive relationship between the summary rating for food, décor, and service and the average cost of a meal per person. This relationship suggests that a higher cost per person is associated with higher ratings for food, décor, and service.
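The predictions in these problems all apply the same prediction line, Ŷ = b0 + b1X. A minimal Python sketch, assuming the rounded coefficients quoted in the solutions and a hypothetical helper named `predict`:

```python
def predict(b0, b1, x):
    """Predicted value of Y from a simple linear regression: Y-hat = b0 + b1 * X."""
    return b0 + b1 * x

# Problem 13.1 (c): b0 = 3, b1 = 6, X = 3
y_13_1 = predict(3, 6, 3)

# Problem 13.5 (d): b0 = -47.8091, b1 = 1.5951, summary rating X = 55
cost_13_5 = predict(-47.8091, 1.5951, 55)

print(y_13_1, round(cost_13_5, 2))
```

Predictions computed from rounded coefficients can differ in the last digits from software output, which carries the coefficients at full precision.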

13.6

(a)

(b) b0 = 42,989.39135680, b1 = 1.60307520.
(c) For each increase of $1,000 in tuition, the mean starting salary is predicted to increase by $1,603.08.
(d) Ŷ = 42,989.39135680 + 1.60307520(50,450) = $123,864.54. A program that has a per-year tuition cost of $50,450 is predicted to have a mean starting salary of $123,864.54.
(e) Starting salary seems higher for those schools that have a higher tuition.

13.7

(a)

(b) Ŷ = 0.7500 + 0.5000X
(c) For each increase of one unit in plate gap, the estimated mean tear rating will increase by 0.5.
(d) Ŷ = 0.7500 + 0.5000X = 0.7500 + 0.5000(0) = 0.7500
(e) Bag tear rating increases as the plate gap on the bag-sealing equipment increases.

13.8

(a)

(b) b0 = –1,410.6988, b1 = 10.9927.
(c) For each additional million dollar increase in revenue, the mean value is predicted to increase by an estimated $10.9927 million. Literal interpretation of b0 is not meaningful because an operating franchise cannot have zero revenue.
(d) Ŷ = –1,410.6988 + 10.9927(250) = $1,337.47 million.
(e) The value of the team can be expected to increase as revenue increases.

13.9

(a)

(b) b0 = 782 and b1 = 0.978.
(c) The Y intercept, b0, would be the mean monthly rental cost when the square footage of an apartment is equal to zero. The literal interpretation is not meaningful in this case because the square footage cannot be equal to zero. The sample slope, b1, indicates that for each unit of change in apartment square footage, the mean monthly rental cost is predicted to increase by $0.978.
(d) Ŷ = 782 + 0.978(800) = $1,564.55. The predicted mean monthly rent for an 800 square foot apartment would be $1,564.55.
(e) It would not be appropriate to use the model in 13.9 (d) to predict the monthly rent for a 1,500 square foot apartment because the model was based on an independent square footage variable that ranged from 434 square feet to 955 square feet. A 1,500 square foot apartment would fall outside of this range. One should only use the relevant range of the independent variable.
(f) The 800 square foot apartment for $1,130 would represent the better deal. Based on the equation in 13.9 (d), an 800 square foot apartment would have a predicted monthly rent of $1,564.55. An 800 square foot apartment renting at $1,130 per month would be $434.55 below the predicted rent. Based on the same equation, an 830 square foot apartment would have a predicted rent of $1,593.90. An 830 square foot apartment renting at $1,410 would only be $183.90 below the predicted amount.

13.10

(a)

(b) b0 = –5.6263, b1 = 1.3712.
(c) For each increase of one million YouTube trailer views, the predicted weekend box office gross is estimated to increase by $1.3712 million.
(d) Ŷ = –5.6263 + 1.3712(20) = $21.7969 million.
(e) You can conclude that the mean predicted increase in weekend box office gross is $1.3712 million for each one million increase in YouTube trailer views.

13.11

83% of the variation in the dependent variable can be explained by the variation in the independent variable.

13.12

SST = 40 and r2 = SSR/SST = 36/40 = 0.90. So, 90% of the variation in the dependent variable can be explained by the variation in the independent variable.

13.13

r2 = SSR/SST = 77/99 = 0.7778. So, 77.78% of the variation in the dependent variable can be explained by the variation in the independent variable.

13.14

SST = SSR + SSE = 30 + 10 = 40 and r2 = SSR/SST = 30/40 = 0.75. So, 75% of the variation in the dependent variable can be explained by the variation in the independent variable.

13.15

Since SST = SSR + SSE and since SSE cannot be a negative number, SST must be at least as large as SSR.
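The arithmetic in Problems 13.12 through 13.14 can be written out as a short Python sketch. The helper name `coefficient_of_determination` is hypothetical; the inputs are the sums of squares given in those solutions.

```python
def coefficient_of_determination(ssr, sst):
    """r^2 = SSR / SST, the proportion of variation in Y explained by X."""
    return ssr / sst

r2_13_12 = coefficient_of_determination(36, 40)

r2_13_13 = coefficient_of_determination(77, 99)

# Problem 13.14: SST is first recovered as SSR + SSE
ssr, sse = 30, 10
r2_13_14 = coefficient_of_determination(ssr, ssr + sse)

print(r2_13_12, round(r2_13_13, 4), r2_13_14)
```

Because SST = SSR + SSE with SSE never negative, the ratio always lies between 0 and 1, which is the point made in 13.15.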

13.16

(a) r2 = SSR/SST = 0.3417. So, 34.17% of the variation in wine quality can be explained by the variation in alcohol content.
(b) SYX = √(SSE/(n – 2)) = √(42.1323/48) = 0.9369
(c) Based on (a) and (b), the model should be moderately useful for predicting wine quality.
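The standard error of the estimate in 13.16 (b) can be checked with a few lines of Python. This sketch assumes SSE = 42.1323 and n = 50, where n is inferred from the n – 2 = 48 in the denominator rather than stated in the solution; the function name is hypothetical.

```python
import math

def standard_error_of_estimate(sse, n):
    """S_YX = sqrt(SSE / (n - 2)) for simple linear regression."""
    return math.sqrt(sse / (n - 2))

# n = 50 is an assumption inferred from the denominator of 48
s_yx = standard_error_of_estimate(42.1323, 50)
print(round(s_yx, 4))
```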

13.17

From PHStat, Restaurant: Cost vs Summary Rating

Regression Statistics
Multiple R             0.5915
R Square               0.3499
Adjusted R Square      0.3433
Standard Error         16.9229
Observations           100

(a) r2 = 0.3499, which means that 34.99% of the variation in the dependent variable can be explained by the variation in the independent variable. In this case, 34.99% of the variation in the cost of a meal per person can be explained by the variation in the summary customer rating of food, décor, and service.
(b) SYX = 16.9229.
(c) Based on 13.17 (a) and (b), the typical difference between the actual meal cost per person and the cost predicted by the regression model using summary rating as the independent variable is approximately $16.92. The model is weak to moderate in its usefulness in predicting the cost of a restaurant meal.

13.18

From PHStat, FTMBA: Mean Starting Salary and Bonus vs Per-year Tuition

Regression Statistics
Multiple R             0.6809
R Square               0.4637
Adjusted R Square      0.4474
Standard Error         25753.6529
Observations           35

(a) r2 = 0.4637. 46.37% of the variation in starting salary can be explained by the variation in tuition.
(b) SYX = 25,753.6529.
(c) Based on (a) and (b), the model should be very useful for predicting the starting salary.

13.19

(a) r2 = 0.3811. So, 38.11% of the variation in tear rating can be explained by the variation in the plate gap.
(b) SYX = 1.0241
(c) Based on (a) and (b), the model should be somewhat useful for predicting the tear rating.

13.20

From PHStat, MLB Values: Current Value vs Annual Revenue

Regression Statistics
Multiple R             0.8293
R Square               0.6877
Adjusted R Square      0.6766
Standard Error         654.9821
Observations           30

(a) r2 = 0.6877. 68.77% of the variation in the value of an MLB baseball team can be explained by the variation in its annual revenue.
(b) SYX = 654.9821
(c) Based on (a) and (b), the model should be very useful for predicting the value of a baseball team.

13.21

(a) r2 = 0.2500, which means that 25% of the variation in the dependent variable can be explained by the variation in the independent variable. In this case, 25% of the variation in the monthly rent can be explained by the variation in the square footage of an apartment.
(b) SYX = 198.486.
(c) Based on 13.21 (a) and (b), the typical difference between the actual monthly rent and the amount predicted by the regression model using apartment square footage as the independent variable is approximately $198.49. The model is relatively weak in its usefulness in predicting the monthly rent of an apartment.
(d) Other variables that might explain the variation in monthly rent could include the following: the age of the apartment, condition of the apartment, crime rate, amenities, and the average income level in the local area.

13.22

From PHStat, Movie: YouTube Trailer Views vs Opening Weekend Gross

Regression Statistics
Multiple R             0.9139
R Square               0.8352
Adjusted R Square      0.8328
Standard Error         15.9010
Observations           71

(a) r2 = 0.8352. 83.52% of the variation in weekend box office gross can be explained by the variation in YouTube trailer views.
(b) SYX = 15.9010.
(c) Based on (a) and (b), the model should be useful for predicting weekend box office gross.
(d) Other variables that might explain the variation in weekend box office gross could be the amount spent on advertising, the timing of the release of the movie, and the type of movie.

13.23

The residual plot reveals no evidence of a pattern in the residuals. The plot reveals that the residuals, the difference between the observed value of Y and the predicted value of Y, are spread above and below 0 for different values of X. There appears to be no violation of the linearity and equal variance assumptions.

13.24

A residual analysis of the data indicates a pattern, with sizable clusters of consecutive residuals that are either all positive or all negative. This pattern indicates a violation of the assumption of linearity. A curvilinear model should be investigated.

13.25

The residual plot reveals no evidence of a pattern in the residuals. The plot reveals that the residuals, the difference between the observed value of Y and the predicted value of Y, are spread above and below 0 for different values of summated rating. There appears to be no violation of the linearity and equal variance assumptions.

13.26

Based on the residual plot, there does not appear to be a pattern in the residual plot. There is no apparent violation of the linearity and equal variance assumptions.

The normal probability plot suggests possible departure from the normality assumption.

13.27

The residual plot reveals that the equal variance assumption is most likely violated. The linearity assumption may also have been violated. A linear fit does not appear to be adequate.

The normal probability plot suggests possible departure from the normality assumption.

13.28

Based on the residual plot, the assumption of equal variance may be violated since there are two large positive residuals for lower salaries.

13.29

The plot of residuals versus the independent variable, square footage, reveals no evidence of a pattern in the residuals. The plot reveals that the residuals, the difference between the observed value of Y and the predicted value of Y, are spread above and below 0 for different values of square footage. There appears to be no violation of the linearity and equal variance assumptions. The normal probability plot reveals no evidence of substantial departure from the normality assumption.

13.30

Based on the residual plot, there is no evidence of a pattern. There is no evidence of a violation of the linearity and equal variance assumptions.

13.31

The plot of residuals versus the independent variable, YouTube trailer views, reveals evidence of a pattern in the residuals. The plot reveals that the residuals, the difference between the observed value of Y and the predicted value of Y, are tightly grouped around 0 for lower values of X, but the variability of the residuals increases substantially as values of X increase. The equal-variance assumption is not valid, indicating a need to use alternative statistical approaches.

13.32

(a) [Figure: Residual Plot, residuals versus time period] An increasing linear relationship exists.
(b) There appears to be strong positive autocorrelation among the residuals.

13.33

(a) There is no apparent pattern in the residuals over time.
(b) D = 1.574 > 1.36. There is no evidence of positive autocorrelation among the residuals.
(c) The data are not positively autocorrelated.

13.34

(a) No, it is not necessary to compute the Durbin-Watson statistic since the data have been collected for a single period for a set of bags.
(b) If a single piece of bag-sealing equipment was studied over a period of time and the amount of plate gap varied over time, computation of the Durbin-Watson statistic would be necessary.

Copyright ©2024 Pearson Education, Inc.


civ Chapter 16: Time-Series Forecasting 13.35

(a)

(b)

From PHStat, Oil and Gasoline, Gasoline vs Crude Oil Calculations b1, b0 Coefficients

0.0332

-0.1248

Y i  0.1248  0.0332  X i 

(c)

The sample slope, b1, indicates that for each unit of change in price of barrel of crude oil, the predicted mean price for a gallon of gas increases by $0.0332.

(d)

Copyright ©2024 Pearson Education, Inc.


Solutions to End-of-Section and Chapter Review Problems cv

13.35 cont.

(d)

The residual plot versus observation order reveals a cyclical pattern across the order of observations.

(e) Durbin-Watson Statistic 0.2678

(f) Because D = 0.2678 < 1.69, one can conclude there is evidence of positive autocorrelation among the residuals.

(g) Based on the results of (d) through (f), there is substantial reason to question the validity of the model. Because of the violation of the independence-of-errors assumption, alternative approaches should be used.

(h) The regression equation reveals a strong positive relationship between the price of a barrel of crude oil and the price of a gallon of gas. Because the data were collected over 306 weeks and there was a clear cyclical pattern in residuals across observation order, alternative statistical approaches should be used.

13.36

(a)

(b) b1 = SSXY/SSX = 201399.05/12495626 = 0.0161
b0 = Ȳ – b1X̄ = 71.2621 – 0.0161(4393) = 0.458

(c) Ŷ = 0.458 + 0.0161X = 0.458 + 0.0161(4500) = 72.908 or $72,908


(d) \( D = \dfrac{\sum_{i=2}^{n} (e_i - e_{i-1})^2}{\sum_{i=1}^{n} e_i^2} = \dfrac{1243.2244}{599.0683} = 2.08 > 1.45 \).

(e) (f)

There is no evidence of positive autocorrelation among the residuals. Based on a residual analysis, the model appears to be adequate. It appears that the number of orders affects the monthly distribution costs.
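The Durbin-Watson computation in 13.36 (d) can be sketched in a few lines of Python. This is only an illustration: the function name `durbin_watson` and the toy residual list are assumptions introduced here, and only the two sums (1243.2244 and 599.0683) come from the solution above.

```python
def durbin_watson(residuals):
    """Return D = sum_{i=2..n} (e_i - e_{i-1})^2 / sum_{i=1..n} e_i^2."""
    if len(residuals) < 2:
        raise ValueError("need at least two residuals")
    # Numerator: squared differences between successive residuals.
    numerator = sum(
        (residuals[i] - residuals[i - 1]) ** 2 for i in range(1, len(residuals))
    )
    # Denominator: sum of squared residuals.
    denominator = sum(e ** 2 for e in residuals)
    if denominator == 0:
        raise ValueError("all residuals are zero")
    return numerator / denominator


# Ratio of the two sums reported in 13.36 (d).
d_13_36 = 1243.2244 / 599.0683

# Hypothetical residuals that alternate in sign (not data from the text).
d_toy = durbin_watson([1.0, -1.0, 1.0, -1.0])

print(round(d_13_36, 2), d_toy)
```

Residuals that alternate in sign push D above 2, while long runs of same-signed residuals push it toward 0, which is the pattern associated with positive autocorrelation.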

13.37

(a) Ŷ = 17.0833 – 5X

(b) Ŷ = 17.0833 – 5(0.5) = 14.5833 seconds

(c) There is no noticeable pattern in the plot.

(d)

H0: There is no autocorrelation. H1: There is positive autocorrelation.

PHStat output: Durbin-Watson Calculations
Sum of Squared Difference of Residuals   238.4375
Sum of Squared Residuals                 138.3333333
Durbin-Watson Statistic                  1.723644578

(e) (f)

dL = 1.27, dU = 1.45. Since D = 1.7236 > 1.45, do not reject H0. There is no evidence of autocorrelation. Based on the results of (c) and (d), there is no reason to question the validity of the model. It appears that the larger the tamping distance, the shorter is the time of separation.
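The decision rule used in 13.37 (d) can be written as a small helper. This is a sketch under the usual convention for a test of positive autocorrelation (below dL: evidence, above dU: no evidence, in between: inconclusive); the function name is hypothetical, and the bounds dL = 1.27 and dU = 1.45 are the ones quoted above.

```python
def dw_positive_autocorrelation(d, d_lower, d_upper):
    """Classify a Durbin-Watson statistic for positive autocorrelation."""
    if d < d_lower:
        return "evidence of positive autocorrelation"
    if d > d_upper:
        return "no evidence of positive autocorrelation"
    # Between the two bounds the test cannot reach a conclusion.
    return "inconclusive"


# 13.37 (d): D = 1.7236 with dL = 1.27 and dU = 1.45.
decision_13_37 = dw_positive_autocorrelation(1.7236, 1.27, 1.45)
print(decision_13_37)
```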

13.38

(a) (b) (c)

b0 = –2.535, b1 = 0.060728

Ŷ = –2.535 + 0.060728X = –2.535 + 0.060728(83) = 2.5054

(d) (e)

D = 1.64>1.42. There is no evidence of positive autocorrelation among the residuals. The plot of the residuals versus time period shows some clustering of positive and negative residuals for intervals in the domain, suggesting a nonlinear model might be better. Otherwise, the model appears to be adequate. There appears to be a positive relationship between sales and atmospheric temperature.

(f) 13.39

13.40

(b) (c)

H0: ρ = 0   H1: ρ ≠ 0

\( t_{STAT} = \dfrac{r - 0}{\sqrt{\dfrac{1 - r^2}{n - 2}}} = \dfrac{0.75 - 0}{\sqrt{\dfrac{1 - 0.75^2}{18 - 2}}} = 4.5356 \)

d.f. = 16, lower critical value = –2.1199, upper critical value = 2.1199. Since tSTAT = 4.5356 is greater than the upper critical value of 2.1199, reject H0.
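As a quick check of the arithmetic in the correlation test above, the statistic can be recomputed directly. The function name `correlation_t_stat` is an assumption made for this sketch; r = 0.75, n = 18, and the critical value 2.1199 are taken from the solution.

```python
import math


def correlation_t_stat(r, n):
    """t statistic for H0: rho = 0, with n - 2 degrees of freedom."""
    return r / math.sqrt((1 - r ** 2) / (n - 2))


t_stat = correlation_t_stat(0.75, 18)
reject_h0 = abs(t_stat) > 2.1199  # two-tail critical value quoted in the solution
print(round(t_stat, 4), reject_h0)
```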

(a) H0: β1 = 0   H1: β1 ≠ 0
Test statistic: tSTAT = (b1 – 0)/Sb1 = 4.5/1.5 = 3.00

(b) With n = 18, df = 18 – 2 = 16, t0.05/2 = ±2.1199.

(c) Reject H0. There is evidence that the fitted linear regression model is useful.

(d) b1 – t0.05/2Sb1 ≤ β1 ≤ b1 + t0.05/2Sb1
4.5 – 2.1199(1.5) ≤ β1 ≤ 4.5 + 2.1199(1.5)
1.32 ≤ β1 ≤ 7.68
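The slope test and confidence interval in 13.40 follow the same two formulas used throughout this section, so a short sketch may help when checking the later PHStat answers by hand. The helper name `slope_inference` is hypothetical; the inputs b1 = 4.5, Sb1 = 1.5, and t = 2.1199 are the values given above.

```python
def slope_inference(b1, s_b1, t_crit):
    """Return the t statistic for H0: beta1 = 0 and the interval b1 +/- t * Sb1."""
    t_stat = b1 / s_b1
    half_width = t_crit * s_b1
    return t_stat, (b1 - half_width, b1 + half_width)


t_stat_slope, (lower, upper) = slope_inference(4.5, 1.5, 2.1199)
print(t_stat_slope, round(lower, 2), round(upper, 2))
```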

13.41

(a)

(b)

MSR = SSR/k = 72/1 = 72
MSE = SSE/(n – k – 1) = 44/18 = 2.444
FSTAT = MSR/MSE = 72/2.444 = 29.4545
F0.05 = 4.41
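The F computation above can be reproduced as follows. This is a minimal sketch: the function name is an assumption, and n = 20 is inferred from the 18 error degrees of freedom (n – k – 1 = 18 with k = 1) rather than stated in the solution.

```python
def regression_f_stat(ssr, sse, n, k=1):
    """Return (MSR, MSE, F) for a regression with k independent variables."""
    msr = ssr / k
    mse = sse / (n - k - 1)
    return msr, mse, msr / mse


# SSR = 72 and SSE = 44 from the solution; n = 20 is inferred, not given.
msr, mse, f_stat = regression_f_stat(72, 44, n=20)
print(msr, round(mse, 3), round(f_stat, 4))
```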


(c) Reject H0. There is evidence that the fitted linear regression model is useful.

(d) r² = SSR/SST = 72/116 = 0.6207, r = –√0.6207 = –0.7878

(e) H0: ρ = 0 (There is no correlation between X and Y.)
H1: ρ ≠ 0 (There is correlation between X and Y.)
d.f. = 18. Decision rule: Reject H0 if |tSTAT| > 2.1009.

Test statistic: \( t_{STAT} = \dfrac{r}{\sqrt{\dfrac{1 - r^2}{n - 2}}} = \dfrac{-0.7878}{\sqrt{\dfrac{1 - 0.6207}{18}}} = -5.4272 \)

Since tSTAT = –5.4272 is less than the lower critical bound of –2.1009, reject H0. There is enough evidence to conclude that the correlation between X and Y is significant.

13.42

(a)

H0: β1 = 0   H1: β1 ≠ 0

            Coefficients   Standard Error   t Stat    P-value   Lower 95%   Upper 95%
Intercept   -0.3529        1.2000           -0.2941   0.7700    -2.7656     2.0599
alcohol     0.5624         0.1127           4.9913    0.0000    0.3359      0.7890

tSTAT = (b1 – β1)/Sb1 = 4.9913 with a p-value = 0.0000 < 0.05. Reject H0. There is enough evidence to conclude that the fitted linear regression model is useful.

(b) b1 ± tα/2Sb1: 0.3359 ≤ β1 ≤ 0.7890

13.43

(a) H0: β1 = 0   H1: β1 ≠ 0

From PHStat

b1, b0 Coefficients   1.5951   -47.8091

ANOVA
             df   SS           MS           F         Significance F
Regression   1    15104.9827   15104.9827   52.7440   0.0000
Residual     98   28065.5273   286.3829
Total        99   43170.5100

                 Coefficients   Standard Error   t Stat    P-value   Lower 95%   Upper 95%
Intercept        -47.8091       14.0423          -3.4046   0.0010    -75.6756    -19.9426
Summary Rating   1.5951         0.2196           7.2625    0.0000    1.1592      2.0309


At the 0.05 significance level, there is evidence of a linear relationship between summated rating and the cost of a meal. Because tSTAT = 7.2625 > 1.9845 or p-value = 0.0000 < 0.05, reject H0.

(b) b1 ± tα/2Sb1 = 1.5951 ± 1.9845(0.2196). Thus, 1.1592 ≤ β1 ≤ 2.0309.

13.44

(a)

H0: β1 = 0   H1: β1 ≠ 0

From PHStat

b1, b0 Coefficients   1.6031   42989.3914

ANOVA
             df   SS                 MS                 F         Significance F
Regression   1    18922315884.4426   18922315884.4426   28.5297   0.0000
Residual     33   21887271048.2431   663250637.8255
Total        34   40809586932.6857

                   Coefficients   Standard Error   t Stat   P-value   Lower 95%   Upper 95%
Intercept          42989.3914     18770.3449       2.2903   0.0285    4800.8375   81177.9453
Per-Year Tuition   1.6031         0.3001           5.3413   0.0000    0.9925      2.2137

At the 0.05 significance level, there is evidence of a linear relationship between tuition and starting salary. Because tSTAT = 5.3413 > 2.0345 or p-value = 0.0000 < 0.05, reject H0.

(b) b1 ± tα/2Sb1 = 1.6031 ± 2.0345(0.3001). Thus, 0.9925 ≤ β1 ≤ 2.2137.

13.45

(a)

            Coefficients   Standard Error   t Stat   P-value   Lower 95%   Upper 95%
Intercept   0.7500         0.2349           3.1922   0.0053    0.2543      1.2457
Plate Gap   0.5000         0.1545           3.2356   0.0049    0.1740      0.8260

p-value = 0.0049 < 0.05. Reject H0. There is evidence that the fitted linear regression model is useful.

(b) b1 ± tα/2Sb1: 0.1740 ≤ β1 ≤ 0.8260

13.46

(a) H0: β1 = 0   H1: β1 ≠ 0

From PHStat

b1, b0 Coefficients   10.9927   -1410.6988

ANOVA
             df   SS              MS              F         Significance F
Regression   1    26454210.9591   26454210.9591   61.6646   0.0000
Residual     28   12012043.2076   429001.5431
Total        29   38466254.1667

                 Coefficients   Standard Error   t Stat    P-value   Lower 95%    Upper 95%
Intercept        -1410.6988     461.6593         -3.0557   0.0049    -2356.3650   -465.0326
Annual Revenue   10.9927        1.3999           7.8527    0.0000    8.1252       13.8602

At the 0.05 significance level, there is evidence of a linear relationship between annual revenue and current value. Because tSTAT = 7.8527 > 2.0484 or p-value = 0.0000 < 0.05, reject H0.

(b) b1 ± tα/2Sb1 = 10.9927 ± 2.0484(1.3999). Thus, 8.1252 ≤ β1 ≤ 13.8602.

13.47

(a) At the 0.05 significance level, there is evidence of a linear relationship between summated rating and the cost of a meal. Because tSTAT = 4.28 > 2.0040 or p-value = 0.000 < 0.05, reject H0.

(b) b1 ± tα/2Sb1 = 0.978 ± 2.0040(0.228). Thus, 0.521088 ≤ β1 ≤ 1.434912.

13.48

(a) H0: β1 = 0   H1: β1 ≠ 0

From PHStat

b1, b0 Coefficients   1.3712   -5.6263

ANOVA
             df   SS            MS           F          Significance F
Regression   1    88415.4136    88415.4136   349.6847   0.0000
Residual     69   17446.1839    252.8432
Total        70   105861.5975

                        Coefficients   Standard Error   t Stat    P-value   Lower 95%   Upper 95%
Intercept               -5.6263        2.4916           -2.2581   0.0271    -10.5968    -0.6558
YouTube Trailer Views   1.3712         0.0733           18.6999   0.0000    1.2249      1.5174

At the 0.05 significance level, there is evidence of a linear relationship between YouTube trailer views and opening weekend box office gross. Because tSTAT = 18.6999 > 1.9949 or p-value = 0.0000 < 0.05, reject H0.

(b) b1 ± tα/2Sb1 = 1.3712 ± 1.9949(0.0733). Thus, 1.2249 ≤ β1 ≤ 1.5174.

13.49

(a)

Alphabet, Inc. moves 5% more than the overall market and is more volatile than the market.
Amazon.com, Inc. moves 23% more than the overall market and is more volatile than the market.
Apple moves 25% more than the overall market and is more volatile than the market.
Hilton Worldwide moves 21% more than the overall market and is more volatile than the market.
Marriott Intl. moves 58% more than the overall market and is more volatile than the market.
Microsoft moves only 92% as much as the overall market and is less volatile than the market.
Pfizer moves only 64% as much as the overall market and is less volatile than the market.
Tesla, Inc. moves 97% more than the overall market and is more volatile than the market.
TORM moves only 19% as much as the overall market and is less volatile than the market.
Walt Disney Co. moves 25% more than the overall market and is more volatile than the market.

(b)

Investors can use the beta value to assess the risk of a stock relative to the overall market. From the list, an investor looking for growth should probably avoid TORM and Pfizer.

13.50

(a) (% daily change in DXRLX) = b0 + 1.75 (% daily change in Russell 2000 index).

(b) If the Russell 2000 index gains 10% in a year, DXRLX is expected to gain an estimated 17.5%.

(c) If the Russell 2000 index loses 20% in a year, DXRLX is expected to lose an estimated 35%.

(d) Risk takers will be attracted to leveraged funds, and risk-averse investors will stay away.

13.51

(a)

r = 0.8391. There appears to be a strong positive linear relationship between calories and sugar (in grams). t = 3.4496, p-value = 0.0183 < 0.05. Reject H0. At the 0.05 level of significance, there is enough evidence of a significant linear relationship between calories and sugar (in grams).

(b)

13.52

(a)

First weekend gross and U.S. gross: r = 0.7601. There appears to be a strong positive linear relationship.
First weekend gross and worldwide gross: r = 0.8596. There appears to be a strong positive linear relationship.
U.S. gross and worldwide gross: r = 0.9448. There appears to be a very strong positive linear relationship.

(b)

First weekend gross and U.S. gross: Since tSTAT = 3.3083 > 2.306 and p-value = 0.0107 < 0.05, reject H0. At the 0.05 level of significance, there is evidence of a linear relationship between first weekend sales and U.S. gross.
First weekend gross and worldwide gross: Since tSTAT = 4.7574 > 2.306 and p-value = 0.0014 < 0.05, reject H0. At the 0.05 level of significance, there is evidence of a linear relationship between first weekend sales and worldwide gross.
U.S. gross and worldwide gross: Since tSTAT = 8.1552 > 2.306 and p-value = 0.0000 < 0.05, reject H0. At the 0.05 level of significance, there is evidence of a linear relationship between U.S. gross and worldwide gross.

13.53

(a)

From PHStat, Download speed and Upload speed

Regression Statistics
Multiple R   0.2463

               Coefficients   Standard Error   t Stat   P-value
Intercept      543.9337       449.0916         1.2112   0.2714
Upload Speed   2.6111         4.1945           0.6225   0.5565

(b)

For download and upload speeds, r = 0.2463. There appears to be a positive linear relationship. At the 0.05 significance level, there is insufficient evidence of a significant linear relationship between download and upload speed. Because tSTAT = 0.6225 < 2.4469 or p-value = 0.5565 > 0.05, do not reject H0.

13.54

(a)

From PHStat, Value and Payroll

Regression Statistics
Multiple R   0.6766

            Coefficients   Standard Error   t Stat   P-value
Intercept   0.2485         0.4066           0.6111   0.5460
Payroll     0.0124         0.0026           4.8616   0.0000

Value and payroll: r = 0.6766. There appears to be a strong positive linear relationship.

From PHStat, Payroll and Wins

Regression Statistics
Multiple R   0.4406

            Coefficients   Standard Error   t Stat    P-value
Intercept   -6.1646        59.9545          -0.1028   0.9188
Wins        1.8943         0.7293           2.5974    0.0148

Payroll and Wins: r = 0.4406. There appears to be a moderate positive linear relationship.

(b)

Value and payroll: Because tSTAT = 4.8616 > 2.0484 or p-value = 0.0000 < 0.05, reject H0. At the 0.05 level of significance, there is significant evidence of a linear relationship between team value and payroll.

(c)

Because tSTAT = 2.5974 > 2.0484 or p-value = 0.0148 < 0.05, reject H0. At the 0.05 level of significance, there is significant evidence of a linear relationship between payroll and wins.

13.55

(a)

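When X = 2, Ŷ = 5 + 3X = 5 + 3(2) = 11

\( h = \dfrac{1}{n} + \dfrac{(X_i - \bar{X})^2}{\sum_{i=1}^{n} (X_i - \bar{X})^2} = \dfrac{1}{20} + \dfrac{(2 - 2)^2}{20} = 0.05 \)

95% confidence interval: \( \hat{Y} \pm t_{0.05/2}\, S_{YX} \sqrt{h} = 11 \pm 2.1009 \cdot 1 \cdot \sqrt{0.05} \)
10.53 ≤ μY|X=2 ≤ 11.47

(b)

95% prediction interval: \( \hat{Y} \pm t_{0.05/2}\, S_{YX} \sqrt{1 + h} = 11 \pm 2.1009 \cdot 1 \cdot \sqrt{1.05} \)
8.847 ≤ YX=2 ≤ 13.153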

13.56

(a)

When X = 4, Ŷ = 5 + 3X = 5 + 3(4) = 17

\( h = \dfrac{1}{n} + \dfrac{(X_i - \bar{X})^2}{\sum_{i=1}^{n} (X_i - \bar{X})^2} = \dfrac{1}{20} + \dfrac{(4 - 2)^2}{20} = 0.25 \)

95% confidence interval: \( \hat{Y} \pm t_{0.05/2}\, S_{YX} \sqrt{h} = 17 \pm 2.1009 \cdot 1 \cdot \sqrt{0.25} \)
15.95 ≤ μY|X=4 ≤ 18.05

(b) 95% prediction interval: \( \hat{Y} \pm t_{0.05/2}\, S_{YX} \sqrt{1 + h} = 17 \pm 2.1009 \cdot 1 \cdot \sqrt{1.25} \)
14.651 ≤ YX=4 ≤ 19.349

(c) The intervals in this problem are wider than the intervals in Exercise 13.55 because the value of X is farther from X̄.
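The interval calculations in 13.55 and 13.56 can be checked with a short Python sketch. The function names are hypothetical, and the inputs SYX = 1, n = 20, X̄ = 2, and SSX = 20 are read off the worked numbers above rather than stated as a data set.

```python
import math


def predict(x):
    """Fitted line used in 13.55 and 13.56: Y-hat = 5 + 3X."""
    return 5 + 3 * x


def regression_intervals(y_hat, t_crit, s_yx, n, x, x_bar, ssx):
    """Return (h, confidence interval for the mean, prediction interval)."""
    h = 1.0 / n + (x - x_bar) ** 2 / ssx
    ci_half = t_crit * s_yx * math.sqrt(h)        # mean response
    pi_half = t_crit * s_yx * math.sqrt(1.0 + h)  # individual response
    ci = (y_hat - ci_half, y_hat + ci_half)
    pi = (y_hat - pi_half, y_hat + pi_half)
    return h, ci, pi


# Values inferred from the worked solutions: S_YX = 1, n = 20, X-bar = 2, SSX = 20.
h2, ci2, pi2 = regression_intervals(predict(2), 2.1009, 1.0, 20, 2, 2.0, 20.0)
h4, ci4, pi4 = regression_intervals(predict(4), 2.1009, 1.0, 20, 4, 2.0, 20.0)

print(h2, [round(v, 3) for v in ci2 + pi2])
print(h4, [round(v, 3) for v in ci4 + pi4])
```

Because h grows with the squared distance of X from X̄, both intervals at X = 4 come out wider than those at X = 2, which is the point made in 13.56 (c).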


13.57

From PHStat, Predict cost of a meal

Confidence Interval Estimate

Data
X Value                                50
Confidence Level                       95%

Intermediate Calculations
Sample Size                            100
Degrees of Freedom                     98
t Value                                1.984467
XBar, Sample Mean of X                 63.47
Sum of Squared Differences from XBar   5936.91
Standard Error of the Estimate         16.92285
h Statistic                            0.040562
Predicted Y (YHat)                     31.9444

For Average Y
Interval Half Width                    6.7635
Confidence Interval Lower Limit        25.1809
Confidence Interval Upper Limit        38.70795

For Individual Response Y
Interval Half Width                    34.2572
Prediction Interval Lower Limit        -2.3128
Prediction Interval Upper Limit        66.20157

(a) 25.1809 ≤ μY|X=50 ≤ 38.70795. The 95% confidence interval estimate is that the population mean cost of a meal is between $25.1809 and $38.70795 for restaurants that have a summary rating of 50.

(b) –2.3128 ≤ YX=50 ≤ 66.20157. The 95% prediction interval estimate is that the cost of a meal for an individual restaurant with a summary rating of 50 is between $0 and $66.20157.

(c) 13.57 (a) represents a confidence interval estimate for the mean value among all restaurants in the study, while 13.57 (b) represents a prediction interval for an individual restaurant. Because there is much more variation in predicting an individual value than in estimating a mean value, the prediction interval for the individual restaurant is much wider than the confidence interval estimate for the mean.

13.58

(a) 4.9741 ≤ μY|X=10 ≤ 5.568984094

(b) 3.3645 ≤ YX=10 ≤ 7.178609919

(c) Part (b) provides a prediction interval for the individual response given a specific value of the independent variable, and part (a) provides an interval estimate for the mean value, given a specific value of the independent variable. Because there is much more variation in predicting an individual value than in estimating a mean value, a prediction interval is wider than a confidence interval estimate.

13.59

(a) 0.2543 ≤ μY|X=0 ≤ 1.2457

(b) –1.4668 ≤ YX=0 ≤ 2.9668

(c) Part (b) provides a prediction interval for the individual response given a specific value of the independent variable, and part (a) provides an interval estimate for the mean value given a specific value of the independent variable. Because there is much more variation in predicting an individual value than in estimating a mean value, a prediction interval is wider than a confidence interval estimate, holding everything else fixed.

13.60

From PHStat, Predict the starting salary

Confidence Interval Estimate

Data
X Value                                50450
Confidence Level                       95%

Intermediate Calculations
Sample Size                            35
Degrees of Freedom                     33
t Value                                2.0345153
XBar, Sample Mean of X                 60836.1143
Sum of Squared Differences from XBar   7363198352
Standard Error of the Estimate         25753.6529
h Statistic                            0.0432215
Predicted Y (YHat)                     123864.535

For Average Y
Interval Half Width                    10893.0553
Confidence Interval Lower Limit        112971.4797
Confidence Interval Upper Limit        134757.59

For Individual Response Y
Interval Half Width                    53516.5443
Prediction Interval Lower Limit        70347.9907
Prediction Interval Upper Limit        177381.079

(a) $112,971.48 ≤ μY|X=50,450 ≤ $134,757.59

(b) $70,347.99 ≤ YX=50,450 ≤ $177,381.08

(c) You can estimate a mean more precisely than you can predict a single observation.

13.61

(a) $1,499.31 ≤ μY|X=800 ≤ $1,629.79. The 95% confidence interval estimate is that the population mean cost for all one-bedroom apartments that are 800 square feet is between $1,499.31 and $1,629.79.

(b) $1,161.46 ≤ YX=800 ≤ $1,967.64. The 95% prediction interval estimate is that the cost of an individual one-bedroom 800 square foot apartment is between $1,161.46 and $1,967.64.

(c) 13.61 (a) represents a confidence interval estimate for the mean value among all 800 square foot apartments, while 13.61 (b) represents a prediction interval for an individual 800 square foot apartment. Because there is much more variation in predicting an individual value than in estimating a mean value, the prediction interval for the individual apartment is much wider than the confidence interval estimate for the mean.

13.62

From PHStat, Predict the mean value

Confidence Interval Estimate

Data
X Value                                250
Confidence Level                       95%

Intermediate Calculations
Sample Size                            30
Degrees of Freedom                     28
t Value                                2.048407
XBar, Sample Mean of X                 318.5333
Sum of Squared Differences from XBar   218921.5
Standard Error of the Estimate         654.9821
h Statistic                            0.054788
Predicted Y (YHat)                     1337.469

For Average Y
Interval Half Width                    314.0416
Confidence Interval Lower Limit        1023.4273
Confidence Interval Upper Limit        1651.511

For Individual Response Y
Interval Half Width                    1377.9334
Prediction Interval Lower Limit        -40.4645
Prediction Interval Upper Limit        2715.402

(a) 1,023.4273 ≤ μY|X=250 ≤ 1,651.511

(b) –40.4645 ≤ YX=250 ≤ 2,715.402

(c) Because there is much more variation in predicting an individual value than in estimating a mean, the prediction interval is wider than the confidence interval.


13.63

From PHStat

Calculations
b1, b0 Coefficients   1.3712   -5.6263

(a) Ŷ = –5.6263 + 1.3712(50) = $62.93 million. The predicted weekend gross for the movie with 50 million views would be $62.93 million.

(b) Because this example focuses on one individual movie, the prediction interval for an individual response is more appropriate.

(c) From PHStat

Confidence Interval Estimate

Data
X Value                           50
Confidence Level                  95%

For Average Y
Interval Half Width               5.5430
Confidence Interval Lower Limit   57.3887
Confidence Interval Upper Limit   68.47471

For Individual Response Y
Interval Half Width               32.2024
Prediction Interval Lower Limit   30.7294
Prediction Interval Upper Limit   95.13409

$30.7294 million ≤ YX=50 ≤ $95.13409 million. The 95% prediction interval estimate of the weekend box office gross for an individual movie that had 50 million YouTube trailer views is between $30.7294 million and $95.13409 million. Because there is much more variation in predicting an individual value than in estimating a mean value, the prediction interval for the weekend box office gross for an individual movie is much wider than the confidence interval estimate for the mean of all movies that had 50 million YouTube trailer views.

13.64

The slope of the line, b1, represents the estimated expected change in Y per unit change in X. It represents the estimated mean amount that Y changes (either positively or negatively) for a particular unit change in X. The Y intercept b0 represents the estimated mean value of Y when X equals 0.

13.65

The coefficient of determination measures the proportion of variation in Y that is explained by the independent variable X in the regression model.
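To make the definition concrete, the sketch below fits a least-squares line to a small made-up data set and splits the total variation into explained and unexplained parts. The data values and function name are hypothetical and are not taken from any problem in this chapter.

```python
def sums_of_squares(x, y):
    """Fit Y-hat = b0 + b1*X by least squares and return (SSR, SSE, SST)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    ssxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    ssx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = ssxy / ssx
    b0 = y_bar - b1 * x_bar
    y_hat = [b0 + b1 * xi for xi in x]
    ssr = sum((yh - y_bar) ** 2 for yh in y_hat)           # explained variation
    sse = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))  # unexplained variation
    sst = sum((yi - y_bar) ** 2 for yi in y)               # total variation
    return ssr, sse, sst


# Hypothetical data, for illustration only.
ssr, sse, sst = sums_of_squares([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
r_squared = ssr / sst
print(round(ssr, 4), round(sse, 4), round(sst, 4), round(r_squared, 4))
```

In this toy case SSR + SSE = SST, and r² = SSR/SST is the share of the variation in Y explained by X.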

13.66

The unexplained variation or error sum of squares (SSE) will be equal to zero only when the regression line fits the data perfectly and the coefficient of determination equals 1.

13.67

The explained variation or regression sum of squares (SSR) will be equal to zero only when there is no relationship between the Y and X variables, and the coefficient of determination equals 0.

13.68

Unless a residual analysis is undertaken, you will not know whether the model fit is appropriate for the data. In addition, residual analysis can be used to check whether the assumptions of regression have been seriously violated.

13.69

The assumptions of regression are normality of error, homoscedasticity, and independence of errors.

13.70

The normality of error assumption can be evaluated by obtaining a histogram, boxplot, and/or normal probability plot of the residuals. The homoscedasticity assumption can be evaluated by plotting the residuals on the vertical axis and the X variable on the horizontal axis. The independence of errors assumption can be evaluated by plotting the residuals on the vertical axis and the time order variable on the horizontal axis. This assumption can also be evaluated by computing the Durbin-Watson statistic.

13.71

If the data in a regression analysis have been collected over time, then the assumption of independence of errors needs to be evaluated using the Durbin-Watson statistic.

13.72

The confidence interval for the mean response estimates the mean response for a given X value. The prediction interval estimates the value for a single item or individual.

13.73

(a)

From PHStat

                     Coefficients   Standard Error   t Stat   P-value   Lower 95%   Upper 95%
Intercept            1.3759         10.8281          0.1271   0.9012    -22.4566    25.2084
Tomatometer Rating   0.0363         0.1363           0.2665   0.7948    -0.2636     0.3362

b0 = 1.3759, b1 = 0.0363

(b) For each one-unit increase in Tomatometer rating, the predicted mean receipts per theater increase by 0.0363 ($thousands). The Y intercept, b0, would be the mean receipts when the Tomatometer rating is equal to zero.

(c) Ŷ = 1.3759 + 0.0363(55) = 3.37 ($thousands). The mean receipt per theater for a movie that has a Tomatometer rating of 55% would be $3,370.

(d) Ŷ = 1.3759 + 0.0363(5) = 1.557 ($thousands). The mean receipt per theater for a movie that has a Tomatometer rating of 5% would be $1,557.

(e) From PHStat, Simple Linear Regression Analysis

Regression Statistics
Multiple R   0.0801
R Square     0.0064

r² = 0.0064. So 0.64% of the variation in movie receipts can be explained by the variation in Tomatometer rating.


(f)

The residual plot reveals no evidence of a violation of linearity and equal variance assumptions. The normal probability plot reveals no substantial departure from the normality assumption.


(g)

                     Coefficients   Standard Error   t Stat   P-value   Lower 95%
Intercept            1.3759         10.8281          0.1271   0.9012    -22.4566
Tomatometer Rating   0.0363         0.1363           0.2665   0.7948    -0.2636

At the 0.05 significance level, there is insufficient evidence of a significant linear relationship between Tomatometer rating and receipts. Because tSTAT = 0.2665 < 2.2010 or p-value = 0.7948 > 0.05, do not reject H0.

(h)

From PHStat

Confidence Interval Estimate

Data
X Value                                55
Confidence Level                       95%

Intermediate Calculations
Sample Size                            13
Degrees of Freedom                     11
t Value                                2.200985
XBar, Sample Mean of X                 75.84615
Sum of Squared Differences from XBar   7297.692
Standard Error of the Estimate         11.64107
h Statistic                            0.136471
Predicted Y (YHat)                     3.37361

For Average Y
Interval Half Width                    9.4652
Confidence Interval Lower Limit        -6.0916
Confidence Interval Upper Limit        12.83882

For Individual Response Y
Interval Half Width                    27.3143
Prediction Interval Lower Limit        -23.9406
Prediction Interval Upper Limit        30.68786

6.0916  Y | X 55  12.83882;  23.9406  Yx 55  30.68786

Note: receipts per theater are in $thousands. (i)

13.74

(a) (b)

(c) (d) (e) (f)

(g) (h)

Based on the results from (a) – (h), Tomatometer rating would not be useful in predicting receipts on the first weekend a movie opens. There was a significant relationship between Tomatometer rating and receipts, with the model accounting for 0.64% of the variation in movie receipts. One should be hesitant to use a Tomatometer rating that falls outside of the values included in the dataset. The dataset always contained a small sample size, which can make it difficult violation of assumptions such as normality. b0 = 24.84, b1 = 0.14 24.84 is the portion of estimated mean delivery time that is not affected by the number of cases delivered. For each additional case, the estimated mean delivery time increases by 0.14 minutes. X  24.84  0.14(150)  45.84 Yˆ  24.84  0.14 No, 500 cases is outside the relevant range of the data used to fit the regression equation. r2 = 0.972. So, 97.2% of the variation in delivery time can be explained by the variation in the number of cases. Based on a visual inspection of the graphs of the distribution of residuals and the residuals versus the number of cases, there is no pattern. The model appears to be adequate. t  24.88  t0.05/2  2.1009 with 18 degrees of freedom for   0.05 . Reject H0. There is evidence that the fitted linear regression model is useful. 44.88  Y | X 150  46.80 41.56  YX 150  50.12

13.75

(a)

Partial PHStat output:

                            Coefficients   Standard Error   t Stat        P-value
Intercept                   78.79634012    12.21480794      6.450886538   3.49317E-06
Diameter at breast height   2.673214402    0.374109159      7.145546532   8.59802E-07

b0 = 78.7963, b1 = 2.6732

(b) The estimated mean height of a redwood tree will increase by 2.6732 feet for each additional inch increase in diameter at breast height.

(c) Ŷ = 78.7963 + 2.6732X = 78.7963 + 2.6732(25) = 145.6267

(d) r² = 0.7288. So 72.88% of the variation in the height of the redwood trees can be explained by the variation in diameter at breast height.


(e) [Residual plot: residuals versus diameter at breast height]

There are clusters of negative residuals at the low and high end of the diameter values. There appears to be some non-linear relationship between height and diameter.

[Normal probability plot: residuals versus Z value]

The normal probability plot does not suggest any possible departure from the normality assumption.

(f) H0: β1 = 0 vs. H1: β1 ≠ 0. Since tSTAT = 7.1455 with a p-value that is virtually 0, reject H0. There is a significant relationship between the height of redwood trees and the breast diameter at the 0.05 level of significance.

(g) 1.8902 ≤ β1 ≤ 3.4562

13.76

(a)

Independent variable is living space. Dependent variable is asking price.


From PHStat, Simple Linear Regression Analysis

Regression Statistics
Multiple R   0.6307
R Square     0.3978

               Coefficients   Standard Error   t Stat    P-value   Lower 95%   Upper 95%
Intercept      408.2614       33.0407          12.3563   0.0000    342.1471    474.3758
Living Space   0.1044         0.0167           6.2433    0.0000    0.0710      0.1379

b0 = 408.2614, b1 = 0.1044.

(b) For each additional square foot of living space in the house, the mean asking price is predicted to increase by $104.40. The estimated asking price of a house with 0 living space is 408.2614 thousand dollars. However, this interpretation is not meaningful because the living space of the house cannot be 0.

(c) Ŷ = 408.2614 + 0.1044(2,000) = 617.1 thousand dollars.

(d)

r2 = 0.3978. So 39.78% of the variation in asking price is explained by the variation in living space.

(e)

Neither the residual plot nor the normal probability plot reveals any potential violation of the linearity, equal variance, and normality assumptions.

(f)

tSTAT = 6.2433 > 2.0010, p-value is 0.0000. Because p-value < 0.05, reject H0. There is evidence of a linear relationship between asking price and living space.

(g)

0.0710  1  0.1379

(h)

The living space in the house is somewhat useful in predicting the asking price, but because only 39.78% of the variation in asking price is explained by variation in living space, other variables should be considered.

13.77

(a)

Independent variable is asking price. Dependent variable is taxes.

From PHStat, Simple Linear Regression Analysis

Regression Statistics
Multiple R   0.9903
R Square     0.9807

               Coefficients   Standard Error   t Stat    P-value   Lower 95%   Upper 95%
Intercept      156.6320       88.6771          1.7663    0.0825    -20.8105    334.0745
Asking Price   7.9345         0.1448           54.7961   0.0000    7.6447      8.2242

b0 = 156.6320, b1 = 7.9345.

(b) The Y intercept, b0, would be the mean yearly taxes when the asking price is equal to zero. The literal interpretation is not meaningful in this case because an asking price of zero dollars is not realistic. The sample slope, b1, indicates that for each additional thousand dollars in asking price, the predicted mean yearly taxes increase by $7.9345.

(c) Ŷ = 156.6320 + 7.9345(400) = $3,330.43. The predicted mean yearly taxes for a $400,000 home would be $3,330.43.


(d)

r2 = 0.9807, which means that 98.07% of the variation in yearly taxes can be explained by the variation in asking price.


(e)

The residual plot indicates that there is very little difference between the predicted Y and the observed value of Y. This is not surprising because r2 = 0.9807. There appears to be no violation of the linearity and equal variance assumptions. The normal probability plot appears to reveal evidence of a violation of the normality assumption. However, both plots show two substantial residual outliers.

(f)

At the 0.05 significance level, there is evidence of a significant linear relationship between yearly taxes and home asking price. Because tSTAT = 54.7961 > 2.0010 or p-value = 0.000 < 0.05, reject H0.

(g)

One can conclude that there is a very strong positive significant relationship between home asking price and yearly taxes, but that the normality assumption of linear regression is not satisfied, due to the presence of outlier(s).

13.78

(a)

Independent variable is efficiency ratio. Dependent variable is ROE (return on equity).

From PHStat, Simple Linear Regression Analysis

Regression Statistics
Multiple R    0.3237
R Square      0.1048

                  Coefficients  Standard Error  t Stat   P-value  Lower 95%  Upper 95%
Intercept         24.6488       2.4836          9.9248   0.0000   19.7203    29.5774
Efficiency Ratio  -0.1447       0.0427          -3.3867  0.0010   -0.2295    -0.0599

b0 = 24.6488, b1 = –0.1447.

(b)

For each additional point on the efficiency ratio, the predicted mean ROE is estimated to decrease by 0.1447. For an efficiency of 0, the predicted mean ROE is 24.6488.

(c)

Ŷ = 24.6488 – 0.1447(60) = 15.9681.

(d)

r2 = 0.1048. So 10.48% of the variation in ROE is explained by the variation in efficiency ratio.

(e)

There is no obvious pattern in the residuals, so the assumptions of regression appear to be met. The model appears to be adequate.

(f)

tSTAT = –3.3867 < –1.9845; reject H0. There is evidence of a linear relationship between efficiency ratio and ROE.

(g)

From PHStat, Confidence Interval Estimate

Data
X Value                            60
Confidence Level                   95%

For Average Y
Interval Half Width                0.8153
Confidence Interval Lower Limit    15.1528
Confidence Interval Upper Limit    16.78338

For Individual Response Y
Interval Half Width                7.8902
Prediction Interval Lower Limit    8.0780
Prediction Interval Upper Limit    23.85826

15.1528 ≤ μY|X=60 ≤ 16.7834 and 8.0780 ≤ YX=60 ≤ 23.8583

(h)

–0.2295 ≤ β1 ≤ –0.0599

(i)

There is a weak relationship between efficiency ratio and ROE.

13.79

(a)

(b)

b0 = 0.4872, b1 = 0.0123. 0.4872 is the portion of estimated mean completion time that is not affected by the number of invoices processed. When there is no invoice to process, the mean completion time is estimated to be 0.4872 hours. Of course, this is not a very meaningful interpretation in the context of the problem. For each additional invoice processed, the estimated mean completion time increases by 0.0123 hours.

(c)

Ŷ = 0.4872 + 0.0123X = 0.4872 + 0.0123(150) = 2.3304

(d)

r2 = 0.8623. 86.23% of the variation in completion time can be explained by the variation in the number of invoices processed.

(e)

Based on a visual inspection of the graphs of the distribution of residuals and the residuals versus the number of invoices and time, there appears to be autocorrelation in the residuals.

(f)

D = 0.69 < 1.37 = dL. There is evidence of positive autocorrelation. The model does not appear to be adequate. The number of invoices and, hence, the time needed to process them, tend to be high for a few days in a row during historically heavier shopping days or during advertised sales days. This could be a possible cause of the positive autocorrelation.

(g)

Due to the violation of the independence of errors assumption, the prediction made in (c) is very likely to be erroneous.
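The Durbin-Watson comparison used in this solution can be illustrated with a short sketch. It assumes the residuals are available as a time-ordered list; the function name `durbin_watson` and the sample residuals are hypothetical and are not the invoice data from the problem.

```python
def durbin_watson(residuals):
    """Return D = sum((e_i - e_{i-1})^2) / sum(e_i^2) for time-ordered residuals."""
    if len(residuals) < 2:
        raise ValueError("need at least two residuals")
    # Numerator: squared differences between successive residuals.
    numerator = sum((residuals[i] - residuals[i - 1]) ** 2
                    for i in range(1, len(residuals)))
    # Denominator: sum of squared residuals.
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator


# Hypothetical residuals that stay on one side of zero for several periods,
# which is the pattern associated with positive autocorrelation.
example_residuals = [0.5, 0.6, 0.4, -0.3, -0.5, -0.4, 0.2, 0.3]
d_stat = durbin_watson(example_residuals)

# Lower critical value quoted in the solution (dL = 1.37).
d_lower = 1.37
positive_autocorrelation = d_stat < d_lower
```

A value of D well below dL, as in the solution (0.69 < 1.37), points to positive autocorrelation; values near 2 are consistent with independent errors.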


13.80

(a)

[Scatter plot: O-ring Damage Index versus Temperature (degrees F)]

There is not any clear relationship between atmospheric temperature and O-ring damage from the scatter plot.

(b)

[Scatter plot: O-ring Damage Index versus Temperature (degrees F)]

(c)

In (b), there are 16 observations with an O-ring damage index of 0 for a variety of temperatures. If one concentrates on these observations with no O-ring damage, there is obviously no relationship between O-ring damage index and temperature. If all observations are used, the observations with no O-ring damage will bias the estimated relationship. If the intention is to investigate the relationship between the degrees of O-ring damage and atmospheric temperature, it makes sense to focus only on the flights in which there was O-ring damage.

(d)

Prediction should not be made for an atmospheric temperature of 31°F because it is outside the range of the temperature variable in the data. Such prediction will involve extrapolation, which assumes that any relationship between two variables will continue to hold outside the domain of the temperature variable.

(e)

Ŷ = 18.036 – 0.240X

(g)

A nonlinear model is more appropriate for these data.


(h)

[Residual plot: Residuals versus Temperature]

The string of negative residuals and positive residuals that lie on a straight line with a positive slope in the lower-right corner of the plot is a strong indication that a nonlinear model should be used if all 23 observations are to be used in the fit.

13.81

(a)

Independent variable is ERA (earned run average). Dependent variable is wins.

From PHStat, Simple Linear Regression Analysis

Regression Statistics
Multiple R    0.8762
R Square      0.7677

            Coefficients  Standard Error  t Stat   P-value  Lower 95%  Upper 95%
Intercept   172.2106      9.5731          17.9889  0.0000   152.6009   191.8203
ERA         -23.0039      2.3915          -9.6189  0.0000   -27.9028   -18.1051

b0 = 172.2106 and b1 = –23.0039.

(b)

The Y intercept, b0, would be the mean number of wins when the team’s ERA is equal to zero. The literal interpretation is not meaningful in this case because a team ERA of zero is not realistic. The sample slope, b1, indicates that for each unit increase in ERA, the predicted mean number of wins decreases by 23.0039.

(c)

Ŷ = 172.2106 – 23.0039(4.5) = 68.693 wins.

(d)

r2 = 0.7677, which means that 76.77% of the variation in season wins can be explained by the variation in a team’s ERA.

(e)

The residual plot reveals no evidence of a pattern in the residuals. The plot reveals that the residuals, the difference between the observed value of Y and the predicted value of Y, are spread above and below 0 for different values of ERA. There appears to be no violation of the linearity and equal variance assumptions. The normal probability plot reveals no evidence of departure from the normality assumption.

(f)

At the 0.05 significance level, there is evidence of a negative linear relationship between the number of wins and team ERA. Because tSTAT = –9.6189 < –2.0484 or p-value = 0.000 < 0.05, reject H0.

(g)

From PHStat, Confidence Interval Estimate

Data
X Value                            4.5
Confidence Level                   95%

For Average Y
Interval Half Width                3.7582
Confidence Interval Lower Limit    64.9347
Confidence Interval Upper Limit    72.45112

64.9347 ≤ μY|X=4.5 ≤ 72.4511. The 95% confidence interval estimate is that the population mean number of wins for all teams with an ERA of 4.5 is between 64.93 and 72.45.
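As a quick arithmetic check, the prediction and both interval estimates for an ERA of 4.5 can be rebuilt from the coefficients and the half widths that PHStat reports (3.7582 for the mean, and 15.2245 for an individual team in part (h)). This is only a sketch; the helper names `predict_simple` and `interval` are made up for illustration.

```python
def predict_simple(b0, b1, x):
    """Predicted Y from a simple linear regression equation."""
    return b0 + b1 * x


def interval(center, half_width):
    """Return (lower, upper) limits for center +/- half_width."""
    return center - half_width, center + half_width


# Coefficients and half widths as reported in the solution to 13.81.
y_hat = predict_simple(172.2106, -23.0039, 4.5)   # predicted wins at ERA = 4.5
ci_low, ci_high = interval(y_hat, 3.7582)         # confidence interval for the mean
pi_low, pi_high = interval(y_hat, 15.2245)        # prediction interval for one team

print(round(y_hat, 3), round(ci_low, 2), round(ci_high, 2),
      round(pi_low, 2), round(pi_high, 2))
```

The prediction interval is wider than the confidence interval because it has to cover the variation of an individual team around the mean, not just the uncertainty in the estimated mean.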

(h)

From PHStat, Confidence Interval Estimate

Data
X Value                            4.5
Confidence Level                   95%

For Individual Response Y
Interval Half Width                15.2245
Prediction Interval Lower Limit    53.4684
Prediction Interval Upper Limit    83.91734

53.4684 ≤ YX=4.5 ≤ 83.9173. The 95% prediction interval estimate is that the number of wins for an individual team with an ERA of 4.5 is between 53.47 and 83.92.

(i)

b1 ± tα/2 Sb1 = –23.0039 ± 2.0484(2.3915); –27.90 ≤ β1 ≤ –18.11.

(j)

The population in this case could include all games played by all teams for the last five years.

(k)

Other variables that one might consider for inclusion in the model would be saves, runs scored per game, batting average, and the number of home runs.

(l)

One can conclude there is a significant negative relationship between team ERA and the number of wins. As the ERA decreases, the expected number of wins increases. The regression equation revealed that 76.77% of the variation in wins can be explained by team ERA.

13.82

(a)

Independent variable is revenue. Dependent variable is market value.

From PHStat, Simple Linear Regression Analysis

Regression Statistics
Multiple R    0.9220
R Square      0.8501

            Coefficients  Standard Error  t Stat   P-value  Lower 95%  Upper 95%
Intercept   -1.9291       0.3284          -5.8748  0.0000   -2.6018    -1.2565
Revenue     0.0139        0.0011          12.6025  0.0000   0.0116     0.0161

b0 = –1.9291, b1 = 0.0139.

(b)

For each additional million-dollar increase in revenue, the current value will increase by an estimated 0.0139 billion. Literal interpretation of b0 is not meaningful because an operating team cannot have negative revenue.

(c)

Ŷ = –1.9291 + 0.0139(250) = 1.5406 billion

(d)

r2 = 0.8501. 85.01% of the variation in the value of an NBA basketball team can be explained by the variation in its annual revenue.

(e)

There does not appear to be a pattern in the residual plot. The assumptions of regression do not appear to be seriously violated.

(f)

tSTAT = 12.6025 > 2.0484 or because the p-value is 0.0000 < 0.05, reject H0 at the 5% level of significance. There is evidence of a linear relationship between annual revenue and value.

(g)

From PHStat, Confidence Interval Estimate

Data
X Value                            250
Confidence Level                   95%

For Average Y
Interval Half Width                0.1662
Confidence Interval Lower Limit    1.3743
Confidence Interval Upper Limit    1.706757

1.3743 ≤ μY|X=250 ≤ 1.7068 billion

(h)

From PHStat, Confidence Interval Estimate

Data
X Value                            250
Confidence Level                   95%

For Individual Response Y
Interval Half Width                0.7665
Prediction Interval Lower Limit    0.7741
Prediction Interval Upper Limit    2.307016

0.7741 ≤ YX=250 ≤ 2.3070 billion

(i)

The relationship between revenue and value is weaker for NBA basketball teams (r2 = 0.8501) than for European soccer teams (r2 = 0.9342) but stronger than for MLB baseball teams (r2 = 0.6877).

13.83

The textbook asks the student for 13.83 (a) to repeat (a) through (h) shown in 13.82. However, the student is to use data from a different file titled “Soccer Values.”

(a)

(a) Independent variable is revenue. Dependent variable is value.

From PHStat


Simple Linear Regression Analysis

Regression Statistics
Multiple R    0.9666
R Square      0.9342

            Coefficients  Standard Error  t Stat   P-value  Lower 95%   Upper 95%
Intercept   -748.4724     174.1255        -4.2985  0.0006   -1117.6020  -379.3428
Revenue     5.0568        0.3354          15.0761  0.0000   4.3458      5.7679

b0 = –748.4724, b1 = 5.0568.

(b) For each additional million-dollar increase in revenue, the current value will increase by an estimated 5.0568 million dollars. Literal interpretation of b0 is not meaningful because an operating team cannot have negative revenue.

(c) Ŷ = –748.4724 + 5.0568(250) = 515.73 million dollars

(d) r2 = 0.9342. 93.42% of the variation in the value of a European soccer team can be explained by the variation in its annual revenue.

(e) There does not appear to be a pattern in the residual plot. The assumptions of regression do not appear to be seriously violated.

(f)

tSTAT = 15.0761 > 2.1199 or because the p-value is 0.0000 < 0.05, reject H0 at the 5% level of significance. There is evidence of a linear relationship between annual revenue and value.

(g)

From PHStat, Confidence Interval Estimate

Data
X Value                            250
Confidence Level                   95%

For Average Y
Interval Half Width                211.5163
Confidence Interval Lower Limit    304.2134
Confidence Interval Upper Limit    727.2461

304.2134 ≤ μY|X=250 ≤ 727.2461 million

(h)

From PHStat, Confidence Interval Estimate

Data
X Value                            250
Confidence Level                   95%

For Individual Response Y
Interval Half Width                582.0576
Prediction Interval Lower Limit    -66.3278
Prediction Interval Upper Limit    1097.787

–66.3278 ≤ YX=250 ≤ 1,097.787 million

(b)

For 13.83 (b), the student is to compare the results from 13.83 (a) to similar problems in the chapter.

Soccer Values dataset analysis: Simple Linear Regression Analysis

Regression Statistics
Multiple R           0.9666
R Square             0.9342
Adjusted R Square    0.9301
Standard Error       255.7971
Observations         18

ANOVA
            df  SS             MS             F         Significance F
Regression  1   14872071.4841  14872071.4841  227.2900  0.0000
Residual    16  1046914.1270   65432.1329
Total       17  15918985.6111

            Coefficients  Standard Error  t Stat   P-value  Lower 95%   Upper 95%
Intercept   -748.4724     174.1255        -4.2985  0.0006   -1117.6020  -379.3428
Revenue     5.0568        0.3354          15.0761  0.0000   4.3458      5.7679

MLB Values dataset analysis: Simple Linear Regression Analysis

Regression Statistics
Multiple R           0.8293
R Square             0.6877
Adjusted R Square    0.6766
Standard Error       654.9821
Observations         30

ANOVA
            df  SS             MS             F        Significance F
Regression  1   26454210.9591  26454210.9591  61.6646  0.0000
Residual    28  12012043.2076  429001.5431
Total       29  38466254.1667

                Coefficients  Standard Error  t Stat   P-value  Lower 95%   Upper 95%
Intercept       -1410.6988    461.6593        -3.0557  0.0049   -2356.3650  -465.0326
Annual Revenue  10.9927       1.3999          7.8527   0.0000   8.1252      13.8602

NBA Financial dataset analysis: Simple Linear Regression Analysis

Regression Statistics
Multiple R           0.9220
R Square             0.8501
Adjusted R Square    0.8448
Standard Error       0.3653
Observations         30

ANOVA
            df  SS       MS       F         Significance F
Regression  1   21.1908  21.1908  158.8236  0.0000
Residual    28  3.7359   0.1334
Total       29  24.9267

            Coefficients  Standard Error  t Stat   P-value  Lower 95%  Upper 95%
Intercept   -1.9291       0.3284          -5.8748  0.0000   -2.6018    -1.2565
Revenue     0.0139        0.0011          12.6025  0.0000   0.0116     0.0161

Among the three sports, annual revenue explains the most variation in franchise value for soccer (r2 = 0.9342), followed by basketball (r2 = 0.8501) and baseball (r2 = 0.6877).

13.84

(a) b0 = –2,629.222, b1 = 82.472.
(b) For each additional centimeter in circumference, the weight is estimated to increase by 82.472 grams.
(c) 2,319.08 grams.
(d) Yes, because circumference is a very strong predictor of weight.
(e) r2 = 0.937.
(f) There appears to be a nonlinear relationship between circumference and weight.
(g) p-value is virtually 0 < 0.05; reject H0.
(h) 72.7875 ≤ β1 ≤ 92.156.

13.85

Solution located in 13.83 (b) of the present solutions.

Chapter 14

14.1

(a)

Holding constant the effect of X2, for each increase of one unit in X1, the response variable Y is estimated to increase by a mean of 4 units. Holding constant the effect of X1, for each increase of one unit in X2, the response variable Y is estimated to increase by a mean of 5 units.

(b)

The Y-intercept 8 is the estimate of the mean value of Y if X1 and X2 are both 0.

14.2

(a)

Holding constant the effect of X2, for each increase of one unit in X1, the response variable Y is estimated to decrease by a mean of 2 units. Holding constant the effect of X1, for each increase of one unit in X2, the response variable Y is estimated to increase by a mean of 7 units.

(b)

The Y-intercept 50 is the estimate of the mean value of Y if X1 and X2 are both 0.

14.3

(a)

Ŷ = 11.002079 + 0.6684X1 + 0.8317X2

(b)

For each one-unit increase in revenue, one would estimate that the predicted mean commitment would increase 0.6683647 units, while holding efficiency constant. For each one-unit increase in efficiency, one would estimate that the predicted mean commitment would increase 0.831739 units, while holding revenue constant.

(c)

The model uses both total revenue and efficiency (the percent of private donations remaining after fundraising expenses) to predict commitment (the percent of total expenses that are directly allocated to charitable services). The model may be more effective in predicting commitment than a model that includes only one of these variables.

14.4

(a)

From PHStat

                    Coefficients
Intercept           1.3847
Efficiency Ratio    -0.0072
Risk-Based Capital  0.0181

Ŷ = 1.3847 – 0.0072X1 + 0.0181X2, where X1 = efficiency ratio and X2 = growth rate (risk-based capital)

(b)

For a given growth rate, for each increase of 1% in efficiency ratio, ROAA decreases by 0.0072%. For a given efficiency ratio, for each increase of 1% in growth rate, ROAA increases by 0.0181%.

(c)

Ŷ = 1.3847 – 0.0072(60) + 0.0181(15) = 1.2231

From PHStat, Confidence Interval Estimate and Prediction Interval

Data
Confidence Level                   95%
Efficiency Ratio given value       60
Risk-Based Capital given value     15

For Average Predicted Y (YHat)
Confidence Interval Lower Limit    1.121979
Confidence Interval Upper Limit    1.324142

For Individual Response Y
Prediction Interval Lower Limit    -0.1988
Prediction Interval Upper Limit    2.64492

(d)

1.1220 ≤ μY|X ≤ 1.3241

(e)

–0.1988 ≤ YX ≤ 2.6449

(f)

The interval in (d) is narrower because it is estimating the mean value, not an individual value.

(g)

The model uses both the efficiency ratio and growth rate to predict ROAA. This may produce a better model than if only one of these independent variables is included.

14.5

(a)

            Coefficients  Standard Error  t Stat   P-value
Intercept   1.1592        1.2719          0.9114   0.3667
alcohol     0.4962        0.1094          4.5378   0.0000
chlorides   -9.6331       3.6818          -2.6164  0.0119

Ŷ = 1.1592 + 0.4962X1 – 9.6331X2

(b)

For a given amount of chlorides, each increase of one percent in alcohol is estimated to result in a mean increase in quality rating of 0.4962. For a given alcohol content, each increase of one unit in chlorides is estimated to result in a mean decrease in quality rating of 9.6331.

(c)

The interpretation of b0 has no practical meaning here because it would have meant the estimated mean quality rating when a wine has 0 alcohol content and 0 amount of chlorides.

(d)

Ŷ = 1.1592 + 0.4962(10) – 9.6331(0.08) = 5.3510.
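The prediction in (d) is a direct substitution into the multiple regression equation, which can be sketched as below. The function name `predict_multiple` is hypothetical; the coefficients are the rounded values from the coefficient table, so the result (about 5.3506) differs slightly from the reported 5.3510, presumably because PHStat works with unrounded coefficients.

```python
def predict_multiple(intercept, slopes, x_values):
    """Predicted Y = b0 + b1*X1 + ... + bk*Xk for one set of X values."""
    if len(slopes) != len(x_values):
        raise ValueError("need exactly one X value per slope")
    return intercept + sum(b * x for b, x in zip(slopes, x_values))


# Rounded coefficients from the solution to 14.5:
# b0 = 1.1592, b1 (alcohol) = 0.4962, b2 (chlorides) = -9.6331.
# Inputs: 10 percent alcohol and 0.08 chlorides.
quality_hat = predict_multiple(1.1592, [0.4962, -9.6331], [10, 0.08])

print(round(quality_hat, 4))
```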

(e)

5.0635 ≤ μY|X ≤ 5.6386

(f)

3.5484 ≤ YX ≤ 7.1536

(g)

The model uses both alcohol content (%) and the amount of chlorides to predict wine quality. This may produce a better model than if only one of these independent variables is included.

14.6

(a)

From PHStat

                               Coefficients
Intercept                      654.7054
Worldwide Revenues             24.6789
Number of New Graduates Hired  1.0579

Ŷ = 654.7054 + 24.6789X1 + 1.0579X2, where X1 = revenues and X2 = new hires

(b)

For a given number of new graduates, for each increase of $1 billion in worldwide revenue, the mean number of full-time jobs added is predicted to increase by 24.6789. For a given worldwide revenue, for each increase of one in the number of new graduates hired, the mean number of full-time jobs added is predicted to increase by 1.0579.

(c)

The Y intercept has no meaning in this problem.

(d)

Holding the other independent variable constant, number of new graduates has a higher slope than worldwide revenue.

14.7

(a) Ŷ = –330.675 + 1.764865X1 – 0.13897X2

(b) For a given amount of remote hours, each increase of one unit in total staff present is estimated to result in a mean increase in standby hours of 1.764865. For a given amount of total staff present, each increase of one unit in remote hours is estimated to result in a mean decrease in standby hours of 0.13897.

(c) The interpretation of b0 has no practical meaning here because it provides an estimate of the mean standby hours when there was no total staff present and no remote hours.

(d) Ŷi = –330.675 + 1.764865(310) – 0.13897(400) = 160.845

(e) 141.7856 ≤ μY|X ≤ 179.9074

(f) 85.2014 ≤ YX ≤ 236.4915

(g) The model uses both the total staff present and the remote hours to predict standby hours. This may produce a better model than if only one of these independent variables is included.

14.8

(a)

From PHStat

            Coefficients
Intercept   1023.3446
House Size  0.0229
Age         -6.3465

Ŷ = 1,023.3446 + 0.0229X1 – 6.3465X2, where X1 = house size and X2 = age

(b)

For a given age, each increase by one square foot in house size is estimated to result in an increase in the mean asking price of $0.0229 thousand. For a given house size, each increase of one year in age is estimated to result in a decrease in mean asking price of $6.3465 thousand.

(c)

The interpretation of b0 has no practical meaning here because it would represent the estimated asking price of a new house that has zero square feet.

(d)

Ŷ = 1,023.3446 + 0.0229(25,000) – 6.3465(55) = 1,247.311 thousand.

From PHStat, Confidence Interval Estimate and Prediction Interval

Data
Confidence Level                   95%
House Size given value             25000
Age given value                    55

For Average Predicted Y (YHat)
Confidence Interval Lower Limit    1165.016
Confidence Interval Upper Limit    1329.606

For Individual Response Y
Prediction Interval Lower Limit    740.8534
Prediction Interval Upper Limit    1753.769

(e)

1,165.016 ≤ μY|X ≤ 1,329.606

(f)

740.8534 ≤ YX ≤ 1,753.769

14.9

There is no evidence of a violation of the assumptions of regression.

14.10

(a)


(b)

(c)


(d)

Based on a residual analysis of (a) to (c), there is no evidence of a violation of the assumptions of regression.

14.11

(a)

(b)

(c)

(d)

The residual plots do not reveal any specific pattern.

(e)

Since the data set is cross-sectional, it is inappropriate to compute the Durbin-Watson statistic.

14.12

(a) There is no evidence of a violation of the assumptions.

(b) Because the data are not collected over time, the Durbin-Watson test is not appropriate.

(c) They are valid.

14.13

(a)

Based upon a residual analysis, the model appears adequate.

(b)

There is no evidence of a pattern over time.

(c)

D = 1.79

(d)

D = 1.79 > 1.55. There is no evidence of positive autocorrelation in the residuals.

14.14

(a)

The residual analysis reveals no patterns.

(b)

Because the data are not collected over time, the Durbin-Watson test is not appropriate.

(c)

There are no apparent violations of the assumptions.

14.15

(a) MSR = SSR/k = 60/2 = 30
    MSE = SSE/(n – k – 1) = 120/18 = 6.67

(b) FSTAT = MSR/MSE = 30/6.67 = 4.5

(c) FSTAT = 4.5 > FU(2, 21–2–1) = 3.555. Reject H0. There is evidence of a significant linear relationship.

(d) r2 = SSR/SST = 60/180 = 0.3333

(e) r2adj = 1 – [(1 – r2Y.12)(n – 1)/(n – k – 1)] = 0.2592

14.16

(a) MSR = SSR/k = 30/2 = 15
    MSE = SSE/(n – k – 1) = 120/10 = 12

(b) FSTAT = MSR/MSE = 15/12 = 1.25

(c) FSTAT = 1.25 < FU(2, 13–2–1) = 4.103. Do not reject H0. There is not sufficient evidence of a significant linear relationship.

(d) r2 = SSR/SST = 30/150 = 0.2

(e) r2adj = 1 – [(1 – r2Y.12)(n – 1)/(n – k – 1)] = 0.04
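The calculations in 14.15 and 14.16 all follow from SSR, SSE, n, and k, so they can be checked with one small helper. This is a sketch for verification only; the function name `overall_f_summary` and the dictionary keys are made up for illustration, and the critical F values still have to come from a table or statistical software.

```python
def overall_f_summary(ssr, sse, n, k):
    """Overall F test quantities for a regression with k independent variables."""
    sst = ssr + sse                      # total sum of squares
    msr = ssr / k                        # mean square regression
    mse = sse / (n - k - 1)              # mean square error
    r2 = ssr / sst                       # coefficient of multiple determination
    adj_r2 = 1 - (1 - r2) * (n - 1) / (n - k - 1)
    return {"MSR": msr, "MSE": mse, "F": msr / mse, "r2": r2, "adj_r2": adj_r2}


# 14.15: SSR = 60, SSE = 120, n = 21, k = 2
p15 = overall_f_summary(60, 120, 21, 2)
# 14.16: SSR = 30, SSE = 120, n = 13, k = 2
p16 = overall_f_summary(30, 120, 13, 2)

# Critical values quoted in the solutions.
reject_15 = p15["F"] > 3.555
reject_16 = p16["F"] > 4.103
```

The adjusted r2 falls further below r2 in 14.16 than in 14.15 because the smaller sample gives a larger (n – 1)/(n – k – 1) penalty.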


14.17

(a)

(b)


(c)

(d)

(a) 0.00% of the variation in price-to-book-value ratio is explained by the variation in return on equity after adjusting for number of independent variables and sample size.
(b) 2.63% of the variation in price-to-book-value ratio is explained by the variation in growth after adjusting for number of independent variables and sample size.
(c) 1.80% of the variation in price-to-book-value ratio is explained by the variation in return on equity and growth after adjusting for number of independent variables and sample size.

(e)

The second model with growth as the only independent variable has the highest adjusted r2. However, only 2.63% of the variation in price-to-book-value ratio is explained by the variation in growth after adjusting for the number of independent variables and sample size. Because all three models are rejected, there is no best model in this case.

14.18

p-value for revenue is 0.0395 < 0.05 and the p-value for efficiency is less than 0.0001 < 0.05. Reject H0 for each of the independent variables. There is evidence of a significant linear relationship with each of the independent variables.

14.19

(a)

            df  SS       MS       F        Significance F
Regression  2   27.2241  13.6120  17.3963  0.0000
Residual    47  36.7759  0.7825
Total       49  64.0000

FSTAT = MSR/MSE = 17.3963. Since p-value = 0.0000 < 0.05, reject H0. There is evidence of a significant linear relationship.

(b) p-value = 0.0000. The probability of obtaining an F test statistic of 17.3963 or larger is 0.0000 if H0 is true.

(c) r2Y.12 = SSR/SST = 0.4254. So, 42.54% of the variation in quality rating can be explained by variation in the percentage of alcohol and variation in chlorides.

(d) r2adj = 1 – [(1 – r2Y.12)(n – 1)/(n – k – 1)] = 0.4009

14.20

From PHStat, Regression Analysis

Regression Statistics
Multiple R           0.5307
R Square             0.2816
Adjusted R Square    0.2743
Standard Error       0.7192
Observations         200

ANOVA
            df   SS        MS       F
Regression  2    39.9437   19.9719  38.6148
Residual    197  101.8898  0.5172
Total       199  141.8335

                    Coefficients  Standard Error  t Stat   P-value
Intercept           1.3847        0.3699          3.7438   0.0002
Efficiency Ratio    -0.0072       0.0060          -1.1984  0.2322
Risk-Based Capital  0.0181        0.0021          8.6824   0.0000

(a) FSTAT = 38.6148 > 3.00; reject H0.

(b) p-value = 0.0000. The probability of obtaining an FSTAT value > 38.6148 if the null hypothesis is true is 0.0000.

(c) r2 = 0.2816. 28.16% of the variation in ROAA can be explained by variation in efficiency ratio and variation in growth.

(d) r2adj = 0.2743

14.21

(a) MSR = SSR/k = 27,662.54/2 = 13,831
    MSE = SSE/(n – k – 1) = 28,802.07/23 = 1,252
    FSTAT = MSR/MSE = 13,831/1,252 = 11.05
    FSTAT = 11.05 > FU(2, 26–2–1) = 3.422. Reject H0. There is evidence of a significant linear relationship.

(b) p-value < 0.001. The probability of obtaining an F test statistic of 11.05 or larger is less than 0.001 if H0 is true.

(c) r2Y.12 = SSR/SST = 27,662.54/56,464.62 = 0.4899. So, 48.99% of the variation in standby hours can be explained by variation in the total staff present and remote hours.

(d) r2adj = 1 – [(1 – r2Y.12)(n – 1)/(n – k – 1)] = 1 – [(1 – 0.4899)(26 – 1)/(26 – 2 – 1)] = 0.4456

14.22

From PHStat, Regression Analysis

Regression Statistics
Multiple R           0.7123
R Square             0.5074
Adjusted R Square    0.4939
Standard Error       1829.4342
Observations         76

ANOVA
            df  SS              MS              F
Regression  2   251630055.3805  125815027.6902  37.5923
Residual    73  244318546.0274  3346829.3976
Total       75  495948601.4079

                               Coefficients  Standard Error  t Stat  P-value
Intercept                      654.7054      257.1081        2.5464  0.0130
Worldwide Revenues             24.6789       7.0951          3.4783  0.0009
Number of New Graduates Hired  1.0579        0.1643          6.4385  0.0000

(a) FSTAT = 37.5923 > 3.13; reject H0. There is evidence of a significant linear relationship.

(b) p-value = 0.0000. The probability of obtaining an FSTAT value > 37.5923 if the null hypothesis is true is 0.0000.

(c) r2 = 0.5074. 50.74% of the variation in full-time jobs added can be explained by variation in worldwide revenue and variation in number of new graduates.

(d) r2adj = 0.4939

14.23

From PHStat, Regression Analysis

Regression Statistics
Multiple R           0.8379
R Square             0.7020
Adjusted R Square    0.6919
Standard Error       249.7391
Observations         62

ANOVA
            df  SS             MS            F
Regression  2   8670214.3248   4335107.1624  69.5067
Residual    59  3679806.9172   62369.6088
Total       61  12350021.2419

            Coefficients  Standard Error  t Stat   P-value
Intercept   1023.3446     92.9492         11.0097  0.0000
House Size  0.0229        0.0024          9.4642   0.0000
Age         -6.3465       1.1018          -5.7603  0.0000

(a) FSTAT = 69.5067 > 3.15; reject H0.

(b) p-value = 0.0000. The probability of obtaining an FSTAT value > 69.5067 if the null hypothesis is true is 0.0000.

(c) r2 = 0.7020. 70.20% of the variation in asking price of a house can be explained by variation in house size and age of the house.

(d) r2adj = 0.6919

14.24

(a) The slope of X2 in terms of the t statistic is 3.75, which is larger than the slope of X1 in terms of the t statistic, which is 3.33.

(b) 95% confidence interval on β1: b1 ± tn–k–1 Sb1, 4 ± 2.1098(1.2)
    1.46824 ≤ β1 ≤ 6.53176

(c)

For X1: tSTAT = b1/Sb1 = 4/1.2 = 3.33 > t17 = 2.1098 with 17 degrees of freedom for α = 0.05. Reject H0. There is evidence that the variable X1 contributes to a model already containing X2.

For X2: tSTAT = b2/Sb2 = 3/0.8 = 3.75 > t17 = 2.1098 with 17 degrees of freedom for α = 0.05. Reject H0. There is evidence that the variable X2 contributes to a model already containing X1.

Both variables X1 and X2 should be included in the model.

14.25

(a)

95% confidence interval estimate of the population slope between commitment and revenue: 0.6683647 ± 1.9853(0.320077), 0.032916 ≤ β1 ≤ 1.303814. 95% confidence interval estimate of the population slope between commitment and efficiency: 0.8317339 ± 1.9853(0.077736), 0.677405 ≤ β2 ≤ 0.986063.
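Both intervals apply the same rule, b ± t(Sb), so the limits can be reproduced with a few lines of code. This is a sketch for checking the arithmetic; the function name `slope_confidence_interval` is hypothetical, and the critical value 1.9853 is taken from the solution instead of being computed.

```python
def slope_confidence_interval(b, s_b, t_crit):
    """Return (lower, upper) limits of b +/- t_crit * s_b."""
    half_width = t_crit * s_b
    return b - half_width, b + half_width


# Slope estimates and standard errors from the solution to 14.25 (a).
rev_low, rev_high = slope_confidence_interval(0.6683647, 0.320077, 1.9853)
eff_low, eff_high = slope_confidence_interval(0.8317339, 0.077736, 1.9853)

print(round(rev_low, 6), round(rev_high, 6))
print(round(eff_low, 6), round(eff_high, 6))
```

Neither interval contains 0, which agrees with the conclusion that each variable has a significant linear relationship with commitment.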

(b)

Note: X1 = revenue and X2 = efficiency.

For X1: tSTAT = b1/Sb1 = 0.6683647/0.320077 = 2.09 > 1.9853 and p-value = 0.0395. Because p-value < 0.05, reject H0.

For X2: tSTAT = b2/Sb2 = 0.8317339/0.077736 = 10.7 > 1.9853 and p-value < 0.0001. Because p-value < 0.05, reject H0.

Because H0 is rejected for both independent variables, both should be included in the model.

14.26

From PHStat

                      Coefficients   Standard Error   t Stat     P-value   Lower 95%   Upper 95%
Intercept             1.3847         0.3699           3.7438     0.0002    0.6553      2.1141
Efficiency Ratio      -0.0072        0.0060           -1.1984    0.2322    -0.0191     0.0047
Risk-Based Capital    0.0181         0.0021           8.6824     0.0000    0.0140      0.0222

(a) 95% confidence interval on β1: b1 ± t Sb1, -0.0072 ± 1.98(0.0060), -0.0191 ≤ β1 ≤ 0.0047

(b) For X1: tSTAT = b1/Sb1 = -0.0072/0.0060 = -1.1984 > -1.96. Do not reject H0. There is no evidence that X1 contributes to a model already containing X2.

For X2: tSTAT = b2/Sb2 = 0.0181/0.0021 = 8.6824 > 1.96. Reject H0. There is evidence that X2 contributes to a model already containing X1. X2 (risk-based capital) should be included in the model.

14.27

             Coefficients   Standard Error   t Stat     P-value
Intercept    1.1592         1.2719           0.9114     0.3667
alcohol      0.4962         0.1094           4.5378     0.0000
chlorides    -9.6331        3.6818           -2.6164    0.0119

(a) 0.2762 ≤ β1 ≤ 0.7162

(b) For X1: tSTAT = b1/Sb1 = 4.5378, p-value = 0.0000. Since p-value < 0.05, reject H0. There is enough evidence that the variable percentage of alcohol contributes to a model already containing chlorides.

For X2: tSTAT = b2/Sb2 = -2.6164, p-value = 0.0119. Since p-value < 0.05, reject H0. There is enough evidence that the variable chlorides contributes to a model already containing percentage of alcohol. Both percentage of alcohol and chlorides should be included in the model.


14.28

From PHStat

                                 Coefficients   Standard Error   t Stat    P-value   Lower 95%   Upper 95%
Intercept                        654.7054       257.1081         2.5464    0.0130    142.2898    1167.1210
Worldwide Revenues               24.6789        7.0951           3.4783    0.0009    10.5383     38.8196
Number of New Graduates Hired    1.0579         0.1643           6.4385    0.0000    0.7304      1.3853

(a) 10.5383 ≤ β1 ≤ 38.8196

(b) For X1: tSTAT = 3.4783 > 1.993. Reject H0. There is evidence that X1 contributes to a model already containing X2.

For X2: tSTAT = 6.4385 > 1.993. Reject H0. There is evidence that X2 contributes to a model already containing X1. Both variables contribute to a model that includes the other variable. You should consider using both in the model.

14.29

(a) 95% confidence interval on β1: b1 ± t_{n-k-1} Sb1, 1.7649 ± 2.0687(0.379), 0.9809 ≤ β1 ≤ 2.5489

(b) For X1: tSTAT = b1/Sb1 = 1.7649/0.379 = 4.66 > t23 = 2.0687 with 23 degrees of freedom for α = 0.05. Reject H0. There is evidence that the variable X1 contributes to a model already containing X2.

For X2: tSTAT = b2/Sb2 = 0.1390/0.0588 = 2.36 > t23 = 2.0687 with 23 degrees of freedom for α = 0.05. Reject H0. There is evidence that the variable X2 contributes to a model already containing X1. Both variables X1 and X2 should be included in the model.
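The t tests for 14.29 (b) follow one pattern: divide the sample slope by its standard error and compare the absolute value with the critical value. The sketch below illustrates that pattern with the numbers from 14.29; the function names t_stat and contributes are assumptions made for illustration, and the critical value 2.0687 is the one quoted in the solution.

```python
def t_stat(b, se_b):
    """Test statistic for H0: beta = 0."""
    return b / se_b


def contributes(b, se_b, t_crit):
    """Two-tail test: reject H0 when |tSTAT| exceeds the critical value."""
    return abs(t_stat(b, se_b)) > t_crit


T_CRIT_23_DF = 2.0687  # critical value quoted in 14.29 for 23 degrees of freedom

t1 = t_stat(1.7649, 0.379)    # X1
t2 = t_stat(0.1390, 0.0588)   # X2

print(round(t1, 2), contributes(1.7649, 0.379, T_CRIT_23_DF))
print(round(t2, 2), contributes(0.1390, 0.0588, T_CRIT_23_DF))
```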

14.30

From PHStat

              Coefficients   Standard Error   t Stat     P-value   Lower 95%   Upper 95%
Intercept     1023.3446      92.9492          11.0097    0.0000    837.3537    1209.3356
House Size    0.0229         0.0024           9.4642     0.0000    0.0181      0.0278
Age           -6.3465        1.1018           -5.7603    0.0000    -8.5512     -4.1419

(a) 0.0181 ≤ β1 ≤ 0.0278

(b) For X1: tSTAT = 9.4642 and p-value = 0.0000. Since p-value < 0.05, reject H0. There is evidence that the variable X1 contributes to a model already containing X2.

For X2: tSTAT = -5.7603 and p-value = 0.0000. Since p-value < 0.05, reject H0. There is evidence that the variable X2 contributes to a model already containing X1. Both X1 (house size) and X2 (age) should be included in the model.

14.31

(a) For X1: SSR(X1 | X2) = SSR(X1 and X2) - SSR(X2) = 60 - 25 = 35

FSTAT = SSR(X1 | X2) / MSE = 35 / (120/13) = 3.79 < FU(1,13) = 4.67 with 1 and 13 degrees of freedom and α = 0.05. Do not reject H0. There is not sufficient evidence that the variable X1 contributes to a model already containing X2.

For X2: SSR(X2 | X1) = SSR(X1 and X2) - SSR(X1) = 60 - 45 = 15

FSTAT = SSR(X2 | X1) / MSE = 15 / (120/13) = 1.625 < FU(1,13) = 4.67 with 1 and 13 degrees of freedom and α = 0.05. Do not reject H0. There is not sufficient evidence that the variable X2 contributes to a model already containing X1. Neither independent variable X1 nor X2 makes a significant contribution to the model in the presence of the other variable.

(b) r²Y1.2 = SSR(X1 | X2) / [SST - SSR(X1 and X2) + SSR(X1 | X2)] = 35 / (180 - 60 + 35) = 0.2258

Holding constant the effect of variable X2, 22.58% of the variation in Y can be explained by the variation in variable X1.

r²Y2.1 = SSR(X2 | X1) / [SST - SSR(X1 and X2) + SSR(X2 | X1)] = 15 / (180 - 60 + 15) = 0.1111

Holding constant the effect of variable X1, 11.11% of the variation in Y can be explained by the variation in variable X2.
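The partial F test and the coefficient of partial determination used in 14.31 reduce to two short formulas. The following sketch restates them in Python with the sums of squares given in 14.31; the function names and argument names are illustrative assumptions, not PHStat terminology.

```python
def partial_f(ssr_full, ssr_other, sse, df_error):
    """Partial F statistic for one variable given the other.

    ssr_full  : SSR(X1 and X2)
    ssr_other : SSR of the model containing only the other variable
    sse       : error sum of squares of the full model
    df_error  : n - k - 1 for the full model
    """
    ssr_contribution = ssr_full - ssr_other
    mse = sse / df_error
    return ssr_contribution / mse


def partial_r2(ssr_full, ssr_other, sst):
    """Coefficient of partial determination for one variable given the other."""
    ssr_contribution = ssr_full - ssr_other
    return ssr_contribution / (sst - ssr_full + ssr_contribution)


# Problem 14.31: SSR(X1 and X2) = 60, SSR(X1) = 45, SSR(X2) = 25,
# SSE = 120 with 13 degrees of freedom, SST = 180.
f_x1 = partial_f(60, 25, 120, 13)
f_x2 = partial_f(60, 45, 120, 13)
r2_x1 = partial_r2(60, 25, 180)
r2_x2 = partial_r2(60, 45, 180)

print(round(f_x1, 2), round(f_x2, 3))
print(round(r2_x1, 4), round(r2_x2, 4))
```

Both F statistics fall below the critical value 4.67 quoted in the solution, which matches the decision not to reject H0 for either variable.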

14.32

(a) For X1: SSR(X1 | X2) = SSR(X1 and X2) - SSR(X2) = 30 - 15 = 15

FSTAT = SSR(X1 | X2) / MSE = 15 / (120/10) = 1.25 < FU(1,10) = 4.965 with 1 and 10 degrees of freedom and α = 0.05. Do not reject H0. There is not sufficient evidence that the variable X1 contributes to a model already containing X2.

For X2: SSR(X2 | X1) = SSR(X1 and X2) - SSR(X1) = 30 - 20 = 10

FSTAT = SSR(X2 | X1) / MSE = 10 / (120/10) = 0.833 < FU(1,10) = 4.965 with 1 and 10 degrees of freedom and α = 0.05. Do not reject H0. There is not sufficient evidence that the variable X2 contributes to a model already containing X1. Neither independent variable X1 nor X2 makes a significant contribution to the model in the presence of the other variable. Also, the overall regression equation involving both independent variables is not significant.

(b) r²Y1.2 = SSR(X1 | X2) / [SST - SSR(X1 and X2) + SSR(X1 | X2)] = 15 / (150 - 30 + 15) = 0.1111

Holding constant the effect of variable X2, 11.11% of the variation in Y can be explained by the variation in variable X1.

r²Y2.1 = SSR(X2 | X1) / [SST - SSR(X1 and X2) + SSR(X2 | X1)] = 10 / (150 - 30 + 10) = 0.0769

Holding constant the effect of variable X1, 7.69% of the variation in Y can be explained by the variation in variable X2.

14.33

(a) For X1: SSR(X1 | X2) = SSR(X1 and X2) - SSR(X2) = 27.2241 - 11.1119 = 16.1122

FSTAT = SSR(X1 | X2) / MSE = 16.1122 / 0.7825 = 20.5916 with 1 and 47 degrees of freedom, and p-value = 0.0000. Reject H0. There is sufficient evidence that the variable percentage alcohol contributes to a model already containing chlorides.

For X2: SSR(X2 | X1) = SSR(X1 and X2) - SSR(X1) = 27.2241 - 21.8677 = 5.3564

FSTAT = SSR(X2 | X1) / MSE = 5.3564 / 0.7825 = 6.8455 with 1 and 47 degrees of freedom, and p-value = 0.0119. Reject H0. There is enough evidence that the variable chlorides contributes to a model already containing percentage alcohol. Since both percentage alcohol and chlorides make a significant contribution to the model in the presence of the other, the most appropriate regression model for this data set should include both percentage alcohol and chlorides.

(b) r²Y1.2 = SSR(X1 | X2) / [SST - SSR(X1 and X2) + SSR(X1 | X2)] = 16.1122 / (64 - 27.2241 + 16.1122) = 0.3046

Holding constant the effect of chlorides, 30.46% of the variation in quality rating can be explained by the variation in percentage alcohol.

r²Y2.1 = SSR(X2 | X1) / [SST - SSR(X1 and X2) + SSR(X2 | X1)] = 5.3564 / (64 - 27.2241 + 5.3564) = 0.1271

Holding constant the effect of percentage alcohol, 12.71% of the variation in quality rating can be explained by the variation in chlorides.

14.34

(a) For X1: SSR(X1 | X2) = SSR(X1 and X2) - SSR(X2) = 39.9437 - 39.2009 = 0.7428

FSTAT = SSR(X1 | X2) / MSE = 0.7428 / (101.8898/197) = 1.44 < 3.84. Do not reject H0. There is insufficient evidence that X1 contributes to a model already containing X2.

For X2: SSR(X2 | X1) = SSR(X1 and X2) - SSR(X1) = 39.9437 - 0.9542 = 38.9895

FSTAT = SSR(X2 | X1) / MSE = 38.9895 / (101.8898/197) = 75.38 > 3.84. Reject H0. There is evidence that X2 contributes to a model already containing X1.

Because only X2 makes a significant contribution to the model in the presence of the other variable, only that variable should be included in the model.

(b) r²Y1.2 = SSR(X1 | X2) / [SST - SSR(X1 and X2) + SSR(X1 | X2)] = 0.7428 / (141.8335 - 39.9437 + 0.7428) = 0.0072

Holding constant the effect of risk-based capital, 0.72% of the variation in ROAA can be explained by the variation in efficiency ratio.

r²Y2.1 = SSR(X2 | X1) / [SST - SSR(X1 and X2) + SSR(X2 | X1)] = 38.9895 / (141.8335 - 39.9437 + 38.9895) = 0.2768

Holding constant the effect of efficiency ratio, 27.68% of the variation in ROAA can be explained by the variation in risk-based capital.

14.35

(a) For X1: SSR(X1 | X2) = SSR(X1 and X2) - SSR(X2) = 27,662.54 - 513.2846 = 27,149.255

FSTAT = SSR(X1 | X2) / MSE = 27,149.255 / (28,802.07/23) = 21.68 > FU(1,23) = 4.279 with 1 and 23 degrees of freedom and α = 0.05. Reject H0. There is evidence that the variable X1 contributes to a model already containing X2.

For X2: SSR(X2 | X1) = SSR(X1 and X2) - SSR(X1) = 27,662.54 - 20,667.4 = 6,995.14

FSTAT = SSR(X2 | X1) / MSE = 6,995.14 / (28,802.07/23) = 5.586 > FU(1,23) = 4.279 with 1 and 23 degrees of freedom and α = 0.05. Reject H0. There is evidence that the variable X2 contributes to a model already containing X1. Since each independent variable, X1 and X2, makes a significant contribution to the model in the presence of the other variable, the most appropriate regression model for this data set should include both variables.

(b) r²Y1.2 = SSR(X1 | X2) / [SST - SSR(X1 and X2) + SSR(X1 | X2)] = 27,149.255 / (56,464.62 - 27,662.54 + 27,149.255) = 0.4852

Holding constant the effect of remote hours, 48.52% of the variation in Y can be explained by the variation in total staff present.

r²Y2.1 = SSR(X2 | X1) / [SST - SSR(X1 and X2) + SSR(X2 | X1)] = 6,995.14 / (56,464.62 - 27,662.54 + 6,995.14) = 0.1954

Holding constant the effect of total staff present, 19.54% of the variation in Y can be explained by the variation in remote hours.

14.36

(a) For X1: SSR(X1 | X2) = SSR(X1 and X2) - SSR(X2) = 251,630,055.3805 - 211,138,564.1929 = 40,491,491.19

FSTAT = SSR(X1 | X2) / MSE = 40,491,491.19 / (244,318,546.0274/73) = 12.098 > 3.98. Reject H0. There is sufficient evidence that X1 contributes to a model containing X2.

For X2: SSR(X2 | X1) = SSR(X1 and X2) - SSR(X1) = 251,630,055.3805 - 112,888,778.5259 = 138,741,276.9

FSTAT = SSR(X2 | X1) / MSE = 138,741,276.9 / (244,318,546.0274/73) = 41.45 > 3.98. Reject H0. There is sufficient evidence that X2 contributes to a model containing X1.

Because both variables make a significant contribution to the model in the presence of the other variable, both variables should be included in the model.

(b) r²Y1.2 = SSR(X1 | X2) / [SST - SSR(X1 and X2) + SSR(X1 | X2)] = 40,491,491.19 / (495,948,601.4079 - 251,630,055.3805 + 40,491,491.19) = 0.142

Holding constant the effect of the number of new graduates, 14.2% of the variation in full-time jobs can be explained by the variation in total worldwide revenue.

r²Y2.1 = SSR(X2 | X1) / [SST - SSR(X1 and X2) + SSR(X2 | X1)] = 138,741,276.9 / (495,948,601.4079 - 251,630,055.3805 + 138,741,276.9) = 0.3622

Holding constant the effect of total worldwide revenue, 36.22% of the variation in full-time jobs can be explained by the variation in the number of new graduates.

14.37

(a) For X1: SSR(X1 | X2) = SSR(X1 and X2) - SSR(X2) = 8,670,214.3248 - 3,083,694.6869 = 5,586,519.638

FSTAT = SSR(X1 | X2) / MSE = 5,586,519.638 / (3,679,806.9172/59) = 89.571 > 4.00. Reject H0. There is sufficient evidence that X1 contributes to a model containing X2.

For X2: SSR(X2 | X1) = SSR(X1 and X2) - SSR(X1) = 8,670,214.3248 - 6,600,743.6307 = 2,069,470.694

FSTAT = SSR(X2 | X1) / MSE = 2,069,470.694 / (3,679,806.9172/59) = 33.18 > 4.00. Reject H0. There is evidence that X2 contributes to a model already containing X1.

Because both variables make a significant contribution to the model in the presence of the other variable, both variables should be included in the model.

(b) r²Y1.2 = SSR(X1 | X2) / [SST - SSR(X1 and X2) + SSR(X1 | X2)] = 5,586,519.638 / (12,350,021.2419 - 8,670,214.3248 + 5,586,519.638) = 0.6029

Holding constant the effect of age, 60.29% of the variation in asking price can be explained by the variation in house size.

r²Y2.1 = SSR(X2 | X1) / [SST - SSR(X1 and X2) + SSR(X2 | X1)] = 2,069,470.694 / (12,350,021.2419 - 8,670,214.3248 + 2,069,470.694) = 0.3600

Holding constant the effect of house size, 36.00% of the variation in asking price can be explained by the variation in age.
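As a cross-check on 14.37, the partial F statistic and the coefficient of partial determination for a single variable can both be recovered from that variable's t statistic: the partial F statistic equals tSTAT squared, and the coefficient of partial determination equals tSTAT² / (tSTAT² + n - k - 1). The sketch below applies this identity to house size, using tSTAT = 9.4642 from the PHStat output and 59 error degrees of freedom; the function names are illustrative assumptions.

```python
def partial_f_from_t(t_stat):
    """For a single coefficient, the partial F statistic is the squared t statistic."""
    return t_stat ** 2


def partial_r2_from_t(t_stat, df_error):
    """Coefficient of partial determination from a t statistic and n - k - 1."""
    t_squared = t_stat ** 2
    return t_squared / (t_squared + df_error)


# House size in 14.37: tSTAT = 9.4642 with n - k - 1 = 59.
f_house_size = partial_f_from_t(9.4642)
r2_house_size = partial_r2_from_t(9.4642, 59)

print(round(f_house_size, 3))
print(round(r2_house_size, 4))
```

The same two functions can be applied to the t statistic for age to check the second coefficient of partial determination.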

14.38

(a) Holding constant the effect of X2, the estimated mean value of the dependent variable will increase by 4 units for each increase of one unit of X1.

(b) Holding constant the effects of X1, the presence of the condition represented by X2 = 1 is estimated to increase the mean value of the dependent variable by 2 units.

(c) t = 3.27 > t17 = 2.1098. Reject H0. The presence of X2 makes a significant contribution to the model.

14.39

(a) First develop a multiple regression model using X1 as the variable for the SAT score and X2 a dummy variable with X2 = 1 if a student had a grade of B or better in the introductory statistics course. If the dummy variable coefficient is significantly different from zero, you need to develop a model with the interaction term X1X2 to make sure that the coefficient of X1 is not significantly different if X2 = 0 or X2 = 1.

(b) If a student received a grade of B or better in the introductory statistics course, the student would be estimated to have a grade point average in accounting that is 0.30 greater than a student who had the same SAT score but did not get a grade of B or better in the introductory statistics course.

14.40

(a) Ŷ = 243.7371 + 9.2189X1 + 12.6967X2, where X1 = number of rooms and X2 = neighborhood (east = 0).

(b) Holding constant the effect of neighborhood, for each additional room, the selling price is estimated to increase by a mean of 9.2189 thousands of dollars, or $9,218.90. For a given number of rooms, a west neighborhood is estimated to increase mean selling price over an east neighborhood by 12.6967 thousands of dollars, or $12,696.70.

(c) Ŷ = 243.7371 + 9.2189(9) + 12.6967(0) = 326.70758 or $326,707.58

$309,560.04 ≤ Y_{X=Xi} ≤ $343,855.11

$321,471.44 ≤ μ_{Y|X=Xi} ≤ $331,943.71
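The prediction in 14.40 (c) can be sketched as a small function of the number of rooms and the neighborhood dummy. This is an illustration only: the name predict_price is an assumption, and because it uses the coefficients as printed (rounded to four decimals), its result for nine rooms in an east neighborhood differs from the PHStat figure of 326.70758 in the later decimals.

```python
def predict_price(rooms, west):
    """Estimated selling price in thousands of dollars.

    rooms : number of rooms (X1)
    west  : neighborhood dummy (X2), 0 for east and 1 for west
    """
    return 243.7371 + 9.2189 * rooms + 12.6967 * west


east_price = predict_price(9, 0)
west_price = predict_price(9, 1)

print(round(east_price, 4))
print(round(west_price - east_price, 4))  # the dummy coefficient
```

The difference between the two predictions is the dummy-variable coefficient, which is why the coefficient is read as the estimated shift in mean selling price between neighborhoods for a given number of rooms.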

(d) [Normal probability plot: Residuals versus Z Value]

[Rooms residual plot: Residuals versus Rooms]

Based on a residual analysis, the model appears adequate.

(e) FSTAT = 55.39, p-value is virtually 0. Since p-value < 0.05, reject H0. There is evidence of a significant relationship between selling price and the two independent variables (rooms and neighborhood).

(f) For X1: tSTAT = 8.9537, p-value is virtually 0. Reject H0. Number of rooms makes a significant contribution and should be included in the model. For X2: tSTAT = 3.5913, p-value = 0.0023 < 0.05. Reject H0. Neighborhood makes a significant contribution and should be included in the model. Based on these results, the regression model with the two independent variables should be used.

(g) 7.0466 ≤ β1 ≤ 11.3913

(h) 5.2378 ≤ β2 ≤ 20.1557

(i) r²adj = 0.851

(j) r²Y1.2 = 0.825. Holding constant the effect of neighborhood, 82.5% of the variation in selling price can be explained by variation in number of rooms. r²Y2.1 = 0.431. Holding constant the effect of number of rooms, 43.1% of the variation in selling price can be explained by variation in neighborhood.

(k) The slope of selling price with number of rooms is the same regardless of whether the house is located in an east or west neighborhood.

(l) Ŷ = 253.95 + 8.032X1 - 5.90X2 + 2.089X1X2. For X1X2: the p-value is 0.330. Do not reject H0. There is no evidence that the interaction term makes a contribution to the model.

(m) The two-variable model in (f) should be used.

(n) The real estate association can conclude that the number of rooms and the neighborhood both significantly affect the selling price, but the number of rooms has a greater effect.

14.41

PHStat output:

Regression Statistics
Multiple R           0.5068
R Square             0.2568
Adjusted R Square    0.2415
Standard Error       1.0509
Observations         100

ANOVA
              df    SS          MS         F          Significance F
Regression     2    37.0257     18.5129    16.7617    0.0000
Residual      97    107.1343    1.1045
Total         99    144.1600

                Coefficients   Standard Error   t Stat     P-value   Lower 95%   Upper 95%
Intercept       0.9342         0.8770           1.0652     0.2894    -0.8064     2.6747
alcohol         0.4652         0.0820           5.6762     0.0000    0.3025      0.6278
Type of Wine    -0.2577        0.2102           -1.2258    0.2232    -0.6749     0.1595

(a) Ŷ = 0.9342 + 0.4652X1 - 0.2577X2

(b) Holding constant the effect of the type of wine, for each additional % increase in alcohol content, wine quality is estimated to increase by a mean of 0.4652. For a given amount of alcohol content, a white wine is estimated to have a 0.2577 higher mean quality than a red wine.

(c) Ŷ = 0.9342 + 0.4652(10) - 0.2577(1) = 5.3283

3.2196 ≤ Y_{X=Xi} ≤ 7.4371

5.0184 ≤ μ_{Y|X=Xi} ≤ 5.6382

(d) PHStat output:

[Residual plot for alcohol: Residuals versus alcohol]

[Residual plot for Type of Wine: Residuals versus Type of Wine]

[Normal probability plot: Residual versus Z Value]

Based on a residual analysis, there is not any obvious pattern in the residual plots, but the normal probability plot indicates departure from the normality assumption.

(e) FSTAT = 16.7617 with a p-value = 0.0000. Reject H0. There is evidence of a relationship between quality and percentage of alcohol and the type of wine.

(f) For X1: tSTAT = 5.6762 with a p-value = 0.0000. Reject H0. Alcohol content makes a significant contribution and should be included in the model. For X2: tSTAT = -1.2258 with a p-value = 0.2232. Do not reject H0. The type of wine does not make a significant contribution and should not be included in the model. Only alcohol content should be kept in the model.

(g) 0.3025 ≤ β1 ≤ 0.6278, -0.6749 ≤ β2 ≤ 0.1595

(h) The slope here takes into account the effect of the other predictor variable, type of wine, while the solution for Problem 13.4 did not.

(i) r² = 0.2568. So, 25.68% of the variation in quality can be explained by variation in alcohol content and variation in the type of wine.

(j) r²adj = 0.2415

(k) r² = 0.2568 while r² = 0.3417 in Problem 13.16 (a).

(l) r²Y1.2 = 0.2493. Holding constant the effect of wine type, 24.93% of the variation in quality can be explained by variation in alcohol content. r²Y2.1 = 0.0153. Holding constant the effect of alcohol content, 1.53% of the variation in quality can be explained by variation in wine type.

(m) The slope of alcohol content is the same regardless of whether the wine is red or white.

(n) PHStat output:

                          Coefficients   Standard Error   t Stat     P-value   Lower 95%   Upper 95%
Intercept                 1.6780         1.1448           1.4658     0.1460    -0.5944     3.9503
alcohol                   0.3947         0.1076           3.6667     0.0004    0.1810      0.6083
Type of Wine              -2.0309        1.7669           -1.1494    0.2533    -5.5382     1.4764
alcohol X Type of Wine    0.1678         0.1660           1.0107     0.3147    -0.1617     0.4973

Since the tSTAT for the significance of X1X2 has a p-value = 0.3147, do not reject H0. There is no evidence that the interaction term makes a contribution to the model.

(o) The one-variable model should be used.

(p) Only the alcohol content is significant in predicting the wine quality.

14.42

(a) Ŷ = 8.0100 + 0.0052X1 - 2.1052X2, where X1 = depth (in feet) and X2 = type of drilling (wet = 0, dry = 1).

(b) Holding constant the effect of type of drilling, for each foot increase in depth of the hole, the additional drilling time is estimated to increase by a mean of 0.0052 minute. For a given depth, dry drilling is estimated to reduce mean additional drilling time over wet drilling by 2.1052 minutes.

(c) Dry drilling: Ŷ = 8.0101 + 0.0052(100) - 2.1052 = 6.4276 minutes.

6.2096 ≤ μ_{Y|X=Xi} ≤ 6.6457, 4.9230 ≤ Y_{X=Xi} ≤ 7.9322

(d) [Depth residual plot: Residuals versus Depth]

Based on a residual analysis, the model appears adequate.

(e) FSTAT = 111.109 with 2 and 97 degrees of freedom, F2,97 = 3.09 using Excel. p-value is virtually 0. Reject H0 at the 5% level of significance. There is evidence of a relationship between additional drilling time and the two independent variables.

(f) For X1: tSTAT = 5.0289 > t97 = 1.9847. Reject H0. Depth of the hole makes a significant contribution and should be included in the model. For X2: tSTAT = -14.0331 < -t97 = -1.9847. Reject H0. Type of drilling makes a significant contribution and should be included in the model. Based on these results, the regression model with the two independent variables should be used.

(g) 0.0032 ≤ β1 ≤ 0.0073

(h) -2.4029 ≤ β2 ≤ -1.8075

(i) r²adj = 0.6899

(j) r²Y1.2 = 0.2068. Holding constant the effect of type of drilling, 20.68% of the variation in additional drilling time can be explained by variation in depth of the hole. r²Y2.1 = 0.6700. Holding constant the effect of the depth of the hole, 67% of the variation in additional drilling time can be explained by variation in type of drilling.

(k) The slope of additional drilling time with depth of the hole is the same regardless of whether it is a dry drilling hole or a wet drilling hole.

(l) Ŷ = 7.9120 + 0.0060X1 - 1.9091X2 - 0.0015X1X2. For X1X2: the p-value is 0.4624 > 0.05. Do not reject H0. There is no evidence that the interaction term makes a contribution to the model.

(m) The two-variable model in (a) should be used.

(n) Both variables affect the drilling time. Dry drilling holes should be used to reduce the drilling time.

14.43

(a) Ŷ = 2.4512 + 0.0482X1 - 4.5283X2, where X1 = amount of cubic feet moved and X2 = whether there is an elevator in the apartment building (yes = 1, no = 0).

(b) Holding constant the effect of elevator in the building, for each cubic foot increase in amount moved, the labor hours are estimated to increase by a mean of 0.0482. For a given amount of cubic feet moved, a building with an elevator is estimated to have mean labor hours 4.5283 below those of a building without an elevator.

(c) Ŷ = 2.4512 + 0.0482(500) - 4.5283(1) = 22.0254

20.1431 ≤ μ_{Y|X=Xi} ≤ 23.9078

12.1150 ≤ Y_{X=Xi} ≤ 31.9359

(d) [Normal probability plot: Residuals versus Z Value]

[Feet residual plot: Residuals versus Feet]

Based on a residual analysis, the errors appear to be normally distributed. The equal variance assumption does not appear to have been violated. The linearity assumption also appears to be intact.

(e) FSTAT = 153.3884, p-value is virtually 0. Since p-value < 0.05, reject H0. There is evidence of a significant relationship between labor hours and the two independent variables (the amount of cubic feet moved and whether there is an elevator in the building).

(f) For X1: tSTAT = 16.015, p-value is virtually 0. Reject H0. The amount of cubic feet moved makes a significant contribution and should be included in the model. For X2: tSTAT = -2.1521, p-value = 0.0388 < 0.05. Reject H0. The presence of an elevator makes a significant contribution and should be included in the model. Based on these results, the regression model with the two independent variables should be used.

(g) 0.0421 ≤ β1 ≤ 0.0543, -8.8091 ≤ β2 ≤ -0.2475

(h) r² = 0.9029. So 90.29% of the variation in labor hours can be explained by variation in the amount of cubic feet moved and whether there is an elevator in the building.

(i) r²adj = 0.8970

(j) r²Y1.2 = 0.8860. Holding constant the effect of the presence of an elevator, 88.6% of the variation in labor hours can be explained by variation in the amount of cubic feet moved. r²Y2.1 = 0.1231. Holding constant the effect of the amount of cubic feet moved, 12.31% of the variation in labor hours can be explained by whether there is an elevator in the building.

(k) The slope of labor hours with the amount of cubic feet moved is the same regardless of whether there is an elevator in the building.

(l) Ŷ = -4.7260 + 0.0573X1 + 5.4614X2 - 0.0139X1X2. For X1X2: the p-value is 0.0257 < 0.05. Reject H0. There is evidence that the interaction term makes a contribution to the model.

(m) The interaction model in (l) should be used.

(n) Both the amount of cubic feet moved and the presence of an elevator affect labor hours.

14.44

(a) Ŷ = 2.1698 - 0.0201X1 - 0.0156X2 + 0.0006X1X2, where X1 = efficiency ratio and X2 = total risk-based capital. For X1X2: p-value = 0.0072 < 0.05. Reject H0. There is evidence that the interaction term makes a contribution to the model.

(b) Because there is evidence of an interaction effect between efficiency ratio and total risk-based capital, the model in (a) should be used.

14.45

From PHStat

Regression Analysis

Regression Statistics
Multiple R           0.7284
R Square             0.5306
Adjusted R Square    0.5209
Standard Error       14.4536
Observations         100

ANOVA
              df    SS            MS            F          Significance F
Regression     2    22906.6581    11453.3290    54.8254    0.0000
Residual      97    20263.8519    208.9057
Total         99    43170.5100

                  Coefficients   Standard Error   t Stat     P-value   Lower 95%   Upper 95%
Intercept         -47.9482       11.9934          -3.9979    0.0001    -71.7517    -24.1447
Summed Rating     1.7375         0.1890           9.1918     0.0000    1.3623      2.1127
Coded Location    -17.8012       2.9129           -6.1111    0.0000    -23.5826    -12.0199


(a) Ŷ = -47.9482 + 1.7375X1 - 17.8012X2.

(b) The Y intercept, b0, would be the mean cost per person for a meal when the summary rating and the location are both zero. The literal interpretation is not meaningful in this case because a summary rating of zero would not be logical. For each one-unit increase in summary rating, one would estimate that the predicted mean cost per person would increase by $1.7375, while holding presence of metro area location constant. When there is a presence of metro area location, the predicted mean cost per person will decrease by $17.8012, while holding summary rating constant.

(c) Ŷ = -47.9482 + 1.7375(60) - 17.8012(0) = 56.3018.

From PHStat

Confidence Interval Estimate and Prediction Interval

Data
Confidence Level                   95%
Summated Rating given value        60
Coded Location given value         0

For Average Predicted Y (YHat)
Confidence Interval Lower Limit    52.13595
Confidence Interval Upper Limit    60.46708

For Individual Response Y
Prediction Interval Lower Limit    27.31431
Prediction Interval Upper Limit    85.28871

52.13595 ≤ μ_{Y|X=Xi} ≤ 60.46708

27.31431 ≤ Y_{X=Xi} ≤ 85.28871

(d) There is no pattern in the relationship between residuals and the predicted value of Y, the value of the summary rating, or the value of the location. The regression assumptions are satisfied.

(e) At the 0.05 significance level, there is evidence of a significant linear relationship between price per person and the independent variables, summary rating and location. Because FSTAT = 54.8254 or p-value = 0.0000, reject H0.
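The overall F statistic, r², and adjusted r² reported for 14.45 can all be recomputed from the regression and residual sums of squares in the ANOVA table. The sketch below does so under the usual definitions (F = MSR/MSE, r² = SSR/SST, adjusted r² = 1 - (1 - r²)(n - 1)/(n - k - 1)); the function name anova_summary is an illustrative assumption.

```python
def anova_summary(ssr, sse, n, k):
    """Return (F statistic, r squared, adjusted r squared).

    ssr : regression sum of squares
    sse : residual (error) sum of squares
    n   : number of observations
    k   : number of independent variables
    """
    sst = ssr + sse
    msr = ssr / k
    mse = sse / (n - k - 1)
    f_stat = msr / mse
    r2 = ssr / sst
    adj_r2 = 1 - (1 - r2) * (n - 1) / (n - k - 1)
    return f_stat, r2, adj_r2


# ANOVA values from the 14.45 PHStat output: n = 100 restaurants, k = 2.
f_stat, r2, adj_r2 = anova_summary(22906.6581, 20263.8519, n=100, k=2)

print(round(f_stat, 4), round(r2, 4), round(adj_r2, 4))
```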

Copyright ©2024 Pearson Education, Inc.


Solutions to End-of-Section and Chapter Review Problems cciii 14.45 cont.

(f)

From PHStat

Coefficients

Standard Error

t Stat

-47.9482

11.9934

Summary Rating

1.7375

Coded Location

-17.8012

Intercept

(g) (h)

P-value

Lower 95%

Upper 95%

-3.9979

0.0001

-71.7517

-24.1447

0.1890

9.1918

0.0000

1.3623

2.1127

2.9129

-6.1111

0.0000

-23.5826

-12.0199

At the 0.05 significance level, there is evidence of linear relationship between summary rating and the cost of a meal. Because tSTAT = 9.19 or p-value = 0.000, reject H0 and include summary rating in the model. At the 0.05 significance level, there is evidence of linear relationship between presence of a restaurant in the metro area and the cost of a meal. Because tSTAT = –6.11 or p-value = 0.0000, reject H0 and include location in the model. Based on these results both summary rating and location should be included in the model. 1.3623  1  2.1127 For Problem 13.5, the sample slope is 1.5951. For (b) of 14.45, the sample slope for summary rating is 1.7375 and –17.8012 for location. The slope from 14.45 (b) is higher due to the addition of the location dummy variable. Because the two variables are not independent, removal of the location dummy variable results in a reduction of the summated rating slope for the model in 13.5. From PHStat Regression Analysis Regression Statistics

(i) (j) (k)

Multiple R

0.7284

R Square

0.5306

Adjusted R Square

0.5209

r2 = 0.5306. Thus, 53.06% of the variation in price per person can be explained by the variation in summary rating and the location of the restaurant. 2 radj = 0.5209. The adjusted r2 takes into account the number of independent variables and the sample size. For Problem 13.17, r2 = 0.3499, which means that 34.49% of the variation in the dependent variable can be explained by the independent variable. For 14.45, r2 = 0.5306, which means that 53.06% of the variation in the dependent variable can be explained by the variation in summated rating and location. From PHStat: Regression Analysis Coefficients of Partial Determination Copyright ©2024 Pearson Education, Inc.


14.45 cont.

Coefficients
r2 Y1.2    0.465534817
r2 Y2.1    0.277980717

(l) r2Y1.2 = 0.4655. Holding the effect of location constant, 46.55% of the variation in price per person can be explained by the variation in summary rating.
r2Y2.1 = 0.2780. Holding the effect of summary rating constant, 27.80% of the variation in price per person can be explained by the variation in restaurant location.

(m) The slope of price per person on summary rating is the same irrespective of the location of the restaurant.

(n) From PHStat, with Summary Rating * Location:

Regression Analysis
Regression Statistics
Multiple R           0.7306
R Square             0.5337
Adjusted R Square    0.5192
Standard Error       14.4801
Observations         100

ANOVA
              df    SS            MS           F
Regression     3    23041.8093    7680.6031    36.6312
Residual      96    20128.7007     209.6740
Total         99    43170.5100

                   Coefficients   Standard Error    t Stat   P-value
Intercept              -55.1287          14.9786   -3.6805    0.0004
Summary Rating           1.8523           0.2373    7.8049    0.0000
Coded Location           2.3450          25.2623    0.0928    0.9262
Rating*Location         -0.3161           0.3937   -0.8029    0.4240

For the interaction term, summary rating*location, tSTAT = –0.8029 with a p-value of 0.4240. Because p-value > 0.05, do not reject H0. There is not sufficient evidence that the interaction term makes a significant contribution.

(o) On the basis of (f) and (n), the most appropriate model would include both summary rating and location as independent variables. The interaction term would not be included in this model because it did not make a significant contribution to the model at the 0.05 significance level. The most appropriate model: Ŷ = –47.9482 + 1.7375X1 – 17.8012X2.

(p) Both the summary rating and the location of the restaurant contribute to the cost of a meal. The model developed in 14.45 indicates that 53.06% of the variation in price per person can be explained by the variation in summary rating and the variation in location of the restaurant.

14.46

(a) (b)

Y  368.2348  36.4269 X1  1.8282 X 2  0.0186 X1 X 2 , where X1 = worldwide revenue, X2 = number of new graduates, where p-value =0.01832 < 0.05. Reject H0. So the term in (a) is significant and should be included in the model. Because there is evidence of an interaction effect, the model in (a) should be used.

14.47

(a)
                       Coefficients   Standard Error    t Stat   P-value
Intercept                    7.5904           3.5598    2.1323    0.0384
alcohol                     -0.1321           0.3430   -0.3850    0.7020
chlorides                  -99.9904          47.0361   -2.1258    0.0389
alcohol X chlorides          8.8772           4.6077    1.9266    0.0602

(b)

14.48

(a)

(b)

14.49

For X1X2: the p-value is 0.0602 > 0.05. Do not reject H0. There is not enough evidence that the interaction term makes a contribution to the model. Since there is not enough evidence of an interaction effect between percentage alcohol and chlorides, the model in problem 14.5 should be used.

Yˆ  250.4237  0.0127X 1  1.4785X 2  0.004X 3 . where X1 = staff present, X2 = remote hours, X3 = X1 X2 For X1X2: the p-value is 0.2353 > 0.05. Do not reject H0. There is not enough evidence that the interaction term makes a contribution to the model. Since there is not enough evidence of an interaction effect between total staff present and remote hours, the model in problem 14.7 should be used.

(a)
               Coefficients   Standard Error    t Stat   P-value
Intercept          -63.9813          16.7997   -3.8085    0.0008
Proficiency          1.1258           0.1589    7.0868    0.0000
Classroom          -22.2887           4.3154   -5.1649    0.0000
Online               8.0880           4.3103    1.8765    0.0719

where X1 = proficiency exam, X2 = classroom dummy, X3 = online dummy

(b) Holding constant the effect of training method, for each point increase in proficiency exam score, the end-of-training exam score is estimated to increase by a mean of 1.1258 points. For a given proficiency exam score, the end-of-training exam score of a trainee who has been trained by the classroom method will have an estimated mean score that is 22.2887 points below that of a trainee who has been trained using the courseware app method. For a given proficiency exam score, the end-of-training exam score of a trainee who has been trained by the online method will have an estimated mean score that is 8.0880 points above that of a trainee who has been trained using the courseware app method.

(c) Ŷ = –63.9813 + 1.1258(100) = 48.5969

(d)

[Figure: Proficiency residual plot (residuals versus proficiency)]

[Figure: Residuals versus predicted Y]

There appears to be a quadratic effect from the residual plots.

[Figure: Normal probability plot (residuals versus Z value)]

14.49 cont.

There is no severe departure from the normality assumption in the normal probability plot.

(e) FSTAT = 31.77 with 3 and 26 degrees of freedom. The p-value is virtually 0. Reject H0 at the 5% level of significance. There is evidence of a relationship between end-of-training exam score and the independent variables.

(f) For X1: tSTAT = 7.0868 and the p-value is virtually 0. Reject H0. Proficiency exam score makes a significant contribution and should be included in the model. For X2: tSTAT = –5.1649 and the p-value is virtually 0. Reject H0. The classroom dummy makes a significant contribution and should be included in the model. For X3: tSTAT = 1.8765 and the p-value = 0.07186. Do not reject H0. There is not sufficient evidence to conclude that there is a difference between the online method and the courseware app method in the mean end-of-training exam scores. Based on the above results, the regression model should use the proficiency exam score and the classroom dummy variable.

(g) 0.7992 ≤ β1 ≤ 1.4523, –31.1591 ≤ β2 ≤ –13.4182, –0.7719 ≤ β3 ≤ 16.9480

(h) r2 = 0.7857. 78.57% of the variation in the end-of-training exam score can be explained by the proficiency exam score and the various training methods.

(i) Adjusted r2 = 0.7610
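The interval estimates in 14.49 (g) can be checked with a few lines of arithmetic. The following is a minimal sketch, not the manual's own procedure: it applies b ± t·Sb to the coefficients and standard errors from the 14.49 regression output. The critical value `T_CRIT` is an assumption read from a t table (two-tail 0.05, 26 degrees of freedom), and the names `estimates` and `confidence_interval` are illustrative.

```python
# Sketch: 95% confidence intervals for the slopes in Problem 14.49.
# Coefficients and standard errors come from the 14.49 regression output.
# T_CRIT is an assumption taken from a t table: t(0.025, 26 df) = 2.0555.
T_CRIT = 2.0555

estimates = {
    "proficiency": (1.1258, 0.1589),
    "classroom": (-22.2887, 4.3154),
    "online": (8.0880, 4.3103),
}


def confidence_interval(b, se, t_crit=T_CRIT):
    """Return (lower, upper) for the interval b ± t_crit * se."""
    margin = t_crit * se
    return b - margin, b + margin


intervals = {name: confidence_interval(b, se) for name, (b, se) in estimates.items()}

for name, (low, high) in intervals.items():
    print(f"{name}: {low:.4f} to {high:.4f}")
```

The interval for the online dummy contains zero, which agrees with the decision in (f) not to reject H0 for X3.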

14.49 cont.

(j) r2Y1.23 = 0.6589. Holding constant the effect of training method, 65.89% of the variation in end-of-training exam score can be explained by variation in the proficiency exam score.
r2Y2.13 = 0.5064. Holding constant the effect of proficiency exam score, 50.64% of the variation in end-of-training exam score can be explained by the difference between classroom and courseware app methods.
r2Y3.12 = 0.1193. Holding constant the effect of proficiency exam score, 11.93% of the variation in end-of-training exam score can be explained by the difference between online and courseware app methods.

(k) The slope of end-of-training exam score with proficiency exam score is the same regardless of the training method.

(l) Let X4 = X1X2, X5 = X1X3.
H0: β4 = β5 = 0. There is no interaction among X1, X2, and X3.
H1: At least one of β4 and β5 is not zero. There is interaction among at least a pair of X1, X2, and X3.
SSR(X4, X5 | X1, X2, X3) = SSR(X1, X2, X3, X4, X5) – SSR(X1, X2, X3)
FSTAT = [SSR(X4, X5 | X1, X2, X3) / 2] / MSE(X1, X2, X3, X4, X5) = 0.8122
The p-value = 0.46 > 0.05. Do not reject H0. The interaction terms do not make a significant contribution to the model.

(m) The regression model should use the proficiency exam score and the classroom dummy variable.

14.50

(a) Predicted Ŷ = 7 + 2X1i + 3X1i² = 7 + 2(2) + 3(2²) = 23.

(b) tSTAT = 2.35 > tα/2 = 2.0518 with 27 degrees of freedom. Reject H0. The quadratic term is significant. The quadratic model is better than the linear model.

(c) tSTAT = 1.17 < tα/2 = 2.0518 with 27 degrees of freedom. Do not reject H0. The quadratic term is not significant. The quadratic model is not better than the linear model.

(d) Predicted Ŷ = 7 – 3.0X1i + 3X1i² = 7 – 3.0(2) + 3(2²) = 13.

14.51

(a) Predicted Ŷ = 7 + 2X1i + 1.5X1i² = 7 + 2(3) + 1.5(3²) = 26.5.

(b) tSTAT = 2.35 > tα/2 = 2.0518 with 27 degrees of freedom. Reject H0. The quadratic term is significant. The quadratic model is better than the linear model.

(c) tSTAT = 1.17 < tα/2 = 2.0518 with 27 degrees of freedom. Do not reject H0. The quadratic term is not significant. The quadratic model is not better than the linear model.

(d) Predicted Ŷ = 7 – 3X1i + 1.5X1i² = 7 – 3(2) + 1.5(2²) = 7.
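The quadratic predictions and the t test decisions in 14.50 and 14.51 follow the same two steps each time, so they can be sketched as two small helpers. This is an illustrative sketch only: the function names are hypothetical, and the critical value 2.0518 is the tabulated value quoted in the solutions for 27 degrees of freedom.

```python
# Sketch: evaluate a quadratic regression equation and compare |tSTAT|
# with the critical value, as done in Problems 14.50 and 14.51.

def predict_quadratic(b0, b1, b2, x):
    """Predicted Y for the model b0 + b1*x + b2*x^2."""
    return b0 + b1 * x + b2 * x ** 2


def quadratic_term_is_significant(t_stat, t_crit=2.0518):
    """Two-tail decision rule: reject H0 (beta2 = 0) when |tSTAT| > t critical."""
    return abs(t_stat) > t_crit


print(predict_quadratic(7, 2, 3, 2))       # 14.50 (a)
print(predict_quadratic(7, -3.0, 3, 2))    # 14.50 (d)
print(predict_quadratic(7, 2, 1.5, 3))     # 14.51 (a)
print(predict_quadratic(7, -3, 1.5, 2))    # 14.51 (d)
print(quadratic_term_is_significant(2.35))
print(quadratic_term_is_significant(1.17))
```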

14.52

(a)
GPA    Predicted HOCS
2.0    2.8600
2.1    3.0342
2.2    3.1948
2.3    3.3418
2.4    3.4752
2.5    3.5950
2.6    3.7012
2.7    3.7938
2.8    3.8728
2.9    3.9382
3.0    3.9900
3.1    4.0282
3.2    4.0528
3.3    4.0638
3.4    4.0612
3.5    4.0450
3.6    4.0152
3.7    3.9718
3.8    3.9148
3.9    3.8442
4.0    3.7600

(b) [Figure: Predicted HOCS versus GPA]

(c) The curvilinear relationship suggests that HOCS increases at a decreasing rate. It reaches its maximum tabulated value of 4.0638 at GPA = 3.3 and declines after that as GPA continues to increase.

(d) An r2 of 0.07 and an adjusted r2 of 0.06 tell you that GPA has very low explanatory power in explaining the variation in HOCS. You can tell that the individual HOCS scores will be scattered quite widely around the curvilinear relationship plotted in (b) and discussed in (c).
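The table in 14.52 (a) can be regenerated from a single quadratic equation. The sketch below is built on an assumption: the coefficients –3.48, 4.53, and –0.68 are not stated in this solution and were inferred from three of the tabulated values (GPA = 2, 3, and 4). With those values the code reproduces the table and locates the turning point of the curve at –b1/(2·b2).

```python
# Sketch: rebuild the predicted HOCS table in 14.52 (a).
# ASSUMPTION: the coefficients are inferred from the tabulated values,
# not quoted from the solution text.
B0, B1, B2 = -3.48, 4.53, -0.68


def predicted_hocs(gpa):
    """Quadratic prediction B0 + B1*GPA + B2*GPA^2."""
    return B0 + B1 * gpa + B2 * gpa ** 2


gpa_grid = [round(2 + 0.1 * i, 1) for i in range(21)]
table = {gpa: round(predicted_hocs(gpa), 4) for gpa in gpa_grid}

# GPA with the largest tabulated prediction, and the exact turning point.
best_tabulated_gpa = max(table, key=table.get)
turning_point = -B1 / (2 * B2)

for gpa, hocs in table.items():
    print(f"{gpa:.1f}  {hocs:.4f}")
print(best_tabulated_gpa, round(turning_point, 2))
```

The exact turning point falls between the tabulated GPA values of 3.3 and 3.4, which is why 3.3 shows the largest prediction in the table.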

14.53

(a)

(b) Ŷ = 59.6691 + 0.0328X – 0.000005029X²

(c) Ŷ = 59.6691 + 0.0328(3000) – 0.000005029(3000)² = 112.6782

(d) The residual plot does not reveal any non-linearity. The normal probability plot does not indicate any severe departure from the normality assumption.

(e) H0: β1 = β2 = 0; H1: At least one βj is not 0. FSTAT = 3192.8738 with a p-value of virtually zero. Reject H0. The overall quadratic relationship is significant.

(f) H0: β2 = 0; H1: β2 ≠ 0. tSTAT = –54.9089 with a p-value = 0.0000 < 0.05. Reject H0. At the 0.05 level of significance, the quadratic model is better than the linear model.

(g) r2 = 0.9584. So, 95.84% of the variation in torque can be explained by the quadratic relationship between torque and RPM.

(h) Adjusted r2 = 0.9581.

(i) Torque depends quadratically on RPM, and 95.84% of the variation in torque can be explained by the quadratic relationship between torque and RPM.

14.54

(a) From PHStat: Domestic Beer, Calories vs Alcohol and Carbohydrates

Regression Analysis
Regression Statistics
Multiple R           0.9835
R Square             0.9673
Adjusted R Square    0.9668
Standard Error       7.9735
Observations         157

ANOVA
              df     SS             MS             F
Regression      2    289334.6109    144667.3054    2275.4758
Residual      154      9790.8159        63.5767
Total         156    299125.4268

                 Coefficients   Standard Error    t Stat   P-value
Intercept             -5.1828           2.5746   -2.0130    0.0459
Alcohol               21.5146           0.5613   38.3308    0.0000
Carbohydrates          3.9387           0.1526   25.8068    0.0000

Ŷ = –5.1828 + 21.5146X1 + 3.9387X2, where X1 = alcohol % and X2 = carbohydrates. FSTAT = 2,275.4758, p-value = 0.0000 < 0.05, so reject H0. At the 5% level of significance, the linear terms are significant together.

(b)

From PHStat: Domestic Beer, Calories vs Alcohol, Carbohydrates, Alcohol Squared, and Carbohydrates Squared

Regression Analysis
Regression Statistics
Multiple R           0.9842
R Square             0.9686
Adjusted R Square    0.9678
Standard Error       7.8617
Observations         157

ANOVA
              df     SS             MS            F
Regression      4    289730.9290    72432.7322    1171.9387
Residual      152      9394.4978       61.8059
Total         156    299125.4268

                 Coefficients   Standard Error    t Stat   P-value
Intercept             10.9881           8.2218    1.3365    0.1834
Alcohol               14.5795           2.9323    4.9721    0.0000
Carbohydrates          4.7076           0.4700   10.0171    0.0000
Alcohol Sq             0.5227           0.2163    2.4166    0.0169
CarbSq                -0.0257           0.0164   -1.5667    0.1193

Ŷ = 10.9881 + 14.5795X1 + 4.7076X2 + 0.5227X1² – 0.0257X2², where X1 = alcohol % and X2 = carbohydrates.

(c)

For the model in (b) that includes the quadratic terms, FSTAT = 1,171.9387, or p-value = 0.0000 < 0.05, so reject H0. At the 5% level of significance, the model with quadratic terms is significant. For the quadratic alcohol % term, tSTAT = 2.4166, and the p-value = 0.0169 < 0.05. Reject H0. There is enough evidence that the quadratic term for alcohol % is significant at the 5% level of significance. For the quadratic carbohydrate term, tSTAT = –1.5667, and the p-value = 0.1193 > 0.05. Do not reject H0. There is insufficient evidence that the quadratic term for carbohydrates is significant at the 5% level of significance. Hence, because the quadratic term for alcohol % is significant, the model in (b) that includes this term is better.

(d)

The number of calories in a beer depends quadratically on the alcohol percentage but linearly on the number of carbohydrates. The alcohol percentage and number of carbohydrates explain about 96.73% of the variation in the number of calories in a beer.

14.55

(a)

(b) Ŷ = –1003.9000 + 6.2937X1 – 0.0098X1²

                  Coefficients   Standard Error    t Stat   P-value
Intercept           -1003.9000         473.7367   -2.1191    0.1014
Temperature             6.2937           3.1635    1.9895    0.1175
Temperature Sq         -0.0098           0.0053   -1.8587    0.1366

(c) There is no obvious pattern in the residual plot and normal probability plot. The model appears to be adequate.

(d) H0: β2 = 0 vs. H1: β2 ≠ 0. Since the p-value = 0.1366 > 0.05, do not reject H0. There is not a significant quadratic relationship between temperature and registration error.

(e) Since the quadratic term is not significant at the 5% level, the linear model is a better fit than the quadratic regression model.

(f) r2 = 0.8216. So, 82.16% of the variation in registration error can be explained by the variation in temperature.

(g) Adjusted r2 = 0.7859.

(h) There is a strong linear relationship between registration error and temperature. Registration error depends linearly on temperature, and 82.16% of the variation in registration error can be explained by the variation in temperature.

14.56

(a)

(b) Ŷ = 18030 – 1813X + 63.2X²

(c) Ŷ = 18030 – 1813(5) + 63.2(5)² = $10,545.6

(d) There are no patterns in the residual plots. There does not appear to be any violation of assumptions.

(e) Because FSTAT = 243.51 or p-value = 0.000, reject H0.

(f) p-value = 0.000. The probability of FSTAT = 243.51 or higher is 0.000, given the null hypothesis is true.

(g) Because tSTAT = 4.86 or p-value = 0.000, reject H0.

(h) The probability of tSTAT < –4.86 or > 4.86 is 0.000, given the null hypothesis is true.

(i) r2 = 0.9312. 93.12% of the variation in price can be explained by the quadratic relationship between age and price.

(j) Adjusted r2 = 0.9273.

(k) There is a strong quadratic relationship between age and price.

14.57

(a)

(b)

From PHStat:

Regression Analysis
Regression Statistics
Multiple R           0.5882
R Square             0.3460
Adjusted R Square    0.2975
Standard Error       7989.3501
Observations         30

ANOVA
              df    SS                  MS                 F
Regression     2     911627846.6386    455813923.3193    7.1411
Residual      27    1723402298.5881     63829714.7625
Total         29    2635030145.2267

                           Coefficients   Standard Error    t Stat   P-value
Intercept                     2704.3781        1848.3824    1.4631    0.1550
Tourism Establishments         335.4417         108.9968    3.0775    0.0047
Establishments^2                -1.1854           0.5293   -2.2396    0.0335

Ŷ = 2,704.3781 + 335.4417X – 1.1854X²

(c) When X = 3, Ŷ = 2,704.3781 + 335.4417(3) – 1.1854(3)² = 3,700.035 thousands. For a country with 3,000 tourist establishments, the predicted mean number of jobs generated in the travel and tourism industry is 3,700,035.

14.57 cont.

(d) A plot of the residuals against the values of the independent variable, number of tourism establishments, reveals a potential violation of the equal variance assumption. A normal probability plot reveals a potential violation of the normality assumption.

(e) At the 0.05 significance level, there is evidence of a significant overall relationship between the number of jobs generated in the travel industry and the number of tourist establishments. Because FSTAT = 7.1411 or p-value = 0.0047, reject H0.

(f) The p-value of 0.0047 indicates that the probability of observing an FSTAT of 7.1411 or greater is 0.0047 when H0 is true.

(g) There is sufficient evidence that the quadratic term is significant at the 0.05 level. Because tSTAT = –2.2396 or p-value = 0.0335, reject H0.

(h) r2 = 0.3460. Thus, 34.60% of the variation in the number of jobs generated in the travel and tourism industry can be explained by the quadratic relationship between the number of jobs and the number of tourism establishments.

(i) Adjusted r2 = 0.2975. The adjusted r2 takes into account the number of independent variables and the sample size.

(j) There is a significant quadratic relationship between the number of jobs generated in the travel and tourism industry in 2021 and the number of establishments that provide overnight accommodation for tourists. However, the results should be interpreted with caution given the potential violations of the equal variance and normality assumptions.

14.58

(a) log Ŷ = log(3.07) + 0.9 log(8.5) + 1.41 log(5.2) = 2.33318
Ŷ = 10^2.33318 = 215.37

(b) Holding constant the effects of X2, for each additional unit of the logarithm of X1, the logarithm of Y is estimated to increase by a mean of 0.9. Holding constant the effects of X1, for each additional unit of the logarithm of X2, the logarithm of Y is estimated to increase by a mean of 1.41.

14.59

(a) ln Ŷ = 4.62 + 0.5(8.5) + 0.7(5.2) = 12.51
Ŷ = e^12.51 = 271,034.12

(b) Holding constant the effects of X2, for each additional unit of X1, the natural logarithm of Y is estimated to increase by a mean of 0.5. Holding constant the effects of X1, for each additional unit of X2, the natural logarithm of Y is estimated to increase by a mean of 0.7.
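The two back-transformations in 14.58 and 14.59 differ only in the base of the logarithm, which is easy to mix up by hand. The sketch below recomputes both predictions with Python's `math` module; the function names are illustrative, and the values X1 = 8.5 and X2 = 5.2 are the ones used in the solutions.

```python
import math

# Sketch: predictions from the transformed models in 14.58 and 14.59.

def predict_log10_model(x1, x2):
    """14.58: log10(Y) = log10(3.07) + 0.9*log10(X1) + 1.41*log10(X2)."""
    log_y = math.log10(3.07) + 0.9 * math.log10(x1) + 1.41 * math.log10(x2)
    return log_y, 10 ** log_y


def predict_ln_model(x1, x2):
    """14.59: ln(Y) = 4.62 + 0.5*X1 + 0.7*X2."""
    ln_y = 4.62 + 0.5 * x1 + 0.7 * x2
    return ln_y, math.exp(ln_y)


log_y, y_multiplicative = predict_log10_model(8.5, 5.2)
ln_y, y_exponential = predict_ln_model(8.5, 5.2)

print(f"14.58: log Y = {log_y:.5f}, Y = {y_multiplicative:.2f}")
print(f"14.59: ln Y = {ln_y:.2f}, Y = {y_exponential:.2f}")
```

In the first model both sides are logged, so the prediction is recovered with a power of 10; in the second only Y is logged, so the prediction is recovered with the exponential function.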

14.60

(a) From PHStat: sqrt (Calories) vs Alcohol and Carbohydrates

Predicted √Y = 6.2596 + 0.7755X1 + 0.1672X2, where X1 = alcohol % and X2 = carbohydrates

Regression Analysis
Regression Statistics
Multiple R           0.9784
R Square             0.9573
Adjusted R Square    0.9568
Standard Error       0.3523
Observations         157

ANOVA
              df     SS          MS          F
Regression      2    428.5147    214.2574    1726.7142
Residual      154     19.1089      0.1241
Total         156    447.6236

                 Coefficients   Standard Error    t Stat   P-value
Intercept              6.2596           0.1137   55.0329    0.0000
Alcohol                0.7755           0.0248   31.2762    0.0000
Carbohydrates          0.1672           0.0067   24.7987    0.0000

The normal probability plot of the linear model showed departure from a normal distribution, so a square-root transformation of calories was done.

(b) FSTAT = 1,726.7142. Because the p-value = 0.000, reject H0 at the 5% level of significance. There is evidence of a significant linear relationship between the square root of calories and the percentage of alcohol and the number of carbohydrates.

(c)

r2 = 0.9573. This indicates that 95.73% of the variation in the square root of calories can be explained by the variation in the percentage of alcohol and the variation in the number of carbohydrates.

(d)

Adjusted r2 = 0.9568.

(e)

The model in 14.60 is better because the residual plot is not right skewed.

14.61

(a) From PHStat: natlog (Calories) vs Alcohol and Carbohydrates

ln Ŷ = 4.0613 + 0.1143X1 + 0.0288X2, where X1 = alcohol % and X2 = carbohydrates

Regression Analysis
Regression Statistics
Multiple R           0.9609
R Square             0.9233
Adjusted R Square    0.9223
Standard Error       0.0760
Observations         157

ANOVA
              df     SS         MS        F
Regression      2    10.7108    5.3554    926.9479
Residual      154     0.8897    0.0058
Total         156    11.6005

                 Coefficients   Standard Error     t Stat   P-value
Intercept              4.0613           0.0245   165.4771    0.0000
Alcohol                0.1143           0.0054    21.3612    0.0000
Carbohydrates          0.0288           0.0015    19.7964    0.0000

(b) The residual plots for percentage of alcohol and number of carbohydrates reveal some potential remaining non-linearity in the transformed dependent variable. The normal probability plot reveals no evidence of a violation of the normality assumption, except for three outlying residuals on the negative side.

(c) At the 0.05 level of significance, there is evidence of a significant overall relationship between the natural logarithm of calories and the percentage of alcohol and the number of carbohydrates. Because FSTAT = 926.9479 or p-value = 0.0000, reject H0.


14.61 cont.

(d) r2 = 0.9233. This indicates that 92.33% of the variation in the natural logarithmic transformation of calories can be explained by the variation in the percentage of alcohol and the variation in the number of carbohydrates.

(e) Adjusted r2 = 0.9223.

(f) The models in 14.54 (r2 = 0.9673) and 14.60 (r2 = 0.9573) can explain more variation in the dependent variable because they had slightly higher r2 values compared to the present model. The model in 14.54 would be the best model because it had the highest r2 value.

14.62

(a) (b) (c)

Predicted ln(Price) = 9.7771 – 0.10218 Age. $10,574.92

(d)

There is no evidence of violations of assumptions. The model is adequate. Because tSTAT = 19.48 or p-value = 0.000, reject H0.


14.62 cont.

(e) (f) (g)

0.9112. 91.12% of the variation in the natural log of price can be explained by the age of the auto. 0.9088. Choose the model from Problem 15.6. That model has a higher adjusted r2 of 92.73%.

14.63

(a) Predicted √Y = 128.29 – 4.798X1

(b) Ŷ = (128.29 – 4.798(5))² = $10,878.49
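Both price models predict a transformed value, so the last step is always to undo the transformation. The sketch below, with illustrative function names, applies this to a 5-year-old vehicle using the rounded coefficients printed in 14.62 and 14.63. Because the printed coefficients are rounded, the log-model result is only expected to land close to the reported $10,574.92, not to match it to the cent.

```python
import math

# Sketch: back-transform the predictions for a 5-year-old vehicle.

def price_from_ln_model(age):
    """14.62: ln(Price) = 9.7771 - 0.10218*Age, so Price = exp(...)."""
    return math.exp(9.7771 - 0.10218 * age)


def price_from_sqrt_model(age):
    """14.63: sqrt(Price) = 128.29 - 4.798*Age, so Price = (...)^2."""
    return (128.29 - 4.798 * age) ** 2


ln_price = price_from_ln_model(5)
sqrt_price = price_from_sqrt_model(5)

print(f"ln model:   ${ln_price:,.2f}")
print(f"sqrt model: ${sqrt_price:,.2f}")
```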

14.63 cont.

(c) The plot of residuals versus age appears to reveal a quadratic pattern, which indicates remaining nonlinearity after the square root transformation. The normal probability plot reveals no evidence of a violation of the normality assumption.

(d) At the 0.05 significance level, there is evidence of a significant overall relationship between the square root of price and age of vehicle. Because FSTAT = 392.73 or p-value = 0.000, reject H0.

(e) r2 = 0.9139. This indicates that 91.39% of the variation in the square root transformation of price can be explained by the variation in the age of a vehicle.

(f) Adjusted r2 = 0.9116. This indicates that 91.16% of the variation in the square root transformation of price can be explained by the variation in the age of a vehicle after adjusting for the number of independent variables and the sample size.

(g) The models in 15.6 and 15.12 can explain more variation in the dependent variable because they had slightly higher r2 values compared to the present model. The models in 15.6 and 15.12 had the same adjusted r2 value of 0.9273. Both the 15.6 model and the 15.12 model would be better than the present model.

14.64

r2 represents the proportion of the variation in Y that is explained by the set of explanatory variables selected. Adjusted r2 takes into account both the number of explanatory variables in the model and the sample size.
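As a concrete check of the two measures, the sketch below recomputes r2 and adjusted r2 from the ANOVA sums of squares reported for Problem 14.57 (n = 30 observations, k = 2 independent variables). The function names are illustrative; the formulas are the standard ones, r2 = SSR/SST and adjusted r2 = 1 – (1 – r2)(n – 1)/(n – k – 1).

```python
# Sketch: r2 and adjusted r2 from the ANOVA table of Problem 14.57.
SSR = 911627846.6386    # regression sum of squares
SST = 2635030145.2267   # total sum of squares
N_OBS = 30              # sample size
K_VARS = 2              # independent variables (X and X^2)


def r_squared(ssr, sst):
    """Proportion of variation in Y explained by the model."""
    return ssr / sst


def adjusted_r_squared(r2, n, k):
    """Penalize r2 for the number of independent variables and sample size."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)


r2 = r_squared(SSR, SST)
adj_r2 = adjusted_r_squared(r2, N_OBS, K_VARS)
print(f"r2 = {r2:.4f}, adjusted r2 = {adj_r2:.4f}")
```

The gap between the two values is noticeable here because the sample is small relative to the number of terms in the model.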

14.65

In the case of the simple linear regression model, the slope b1 represents the change in the estimated mean of Y per unit change in X and does not take into account any other variables. In the multiple linear regression model, the slope b1 represents the change in the estimated mean of Y per unit change in X1, taking into account the effect of all the other independent variables.

14.66

Testing the significance of the entire regression model involves a simultaneous test of whether any of the independent variables are significant. Testing the contribution of each independent variable tests the contribution of that independent variable after accounting for the effect of the other independent variables in the model.
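The two kinds of test use different statistics from the same output. As a hedged illustration, the sketch below recomputes the overall FSTAT = MSR/MSE from the ANOVA table of Problem 14.54 (a) and the individual tSTAT = b/Sb values from its coefficient table. The variable and function names are illustrative, and small differences from the printed values come from rounding in the printed output.

```python
# Sketch: overall F statistic versus individual t statistics, 14.54 (a).
SSR, SSE = 289334.6109, 9790.8159   # regression and residual sums of squares
N_OBS, K_VARS = 157, 2


def overall_f(ssr, sse, n, k):
    """FSTAT = MSR / MSE with k and n - k - 1 degrees of freedom."""
    msr = ssr / k
    mse = sse / (n - k - 1)
    return msr / mse


def t_statistic(b, se):
    """tSTAT for one slope, given its estimate and standard error."""
    return b / se


f_stat = overall_f(SSR, SSE, N_OBS, K_VARS)
t_alcohol = t_statistic(21.5146, 0.5613)
t_carbs = t_statistic(3.9387, 0.1526)

print(f"FSTAT = {f_stat:.4f}")
print(f"tSTAT alcohol = {t_alcohol:.2f}, carbohydrates = {t_carbs:.2f}")
```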


14.67

The coefficient of partial determination measures the proportion of variation in Y explained by a particular X variable holding constant the effect of the other independent variables in the model. The coefficient of multiple determination measures the proportion of variation in Y explained by all the X variables included in the model.
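One way to check reported coefficients of partial determination without the sums of squares is the identity r2 (partial) = t² / (t² + residual degrees of freedom), where t is the tSTAT of that variable in the full model. This identity is a general regression result, not a method stated in these solutions, so treat the sketch as a cross-check under that assumption. It uses the tSTAT values from Problem 14.49, where the residual degrees of freedom are 26.

```python
# Sketch: coefficient of partial determination from a t statistic.
# ASSUMPTION: uses the identity partial r2 = t^2 / (t^2 + df_residual),
# which is not the formula quoted in the solutions.

def partial_r_squared(t_stat, df_residual):
    """Proportion of remaining variation explained by one variable."""
    t_sq = t_stat ** 2
    return t_sq / (t_sq + df_residual)


DF_RESIDUAL = 26  # from the F test in 14.49 (3 and 26 degrees of freedom)

partials = {
    "proficiency": partial_r_squared(7.0868, DF_RESIDUAL),
    "classroom": partial_r_squared(-5.1649, DF_RESIDUAL),
    "online": partial_r_squared(1.8765, DF_RESIDUAL),
}

for name, value in partials.items():
    print(f"{name}: {value:.4f}")
```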

14.68

Dummy variables are used to represent categorical independent variables in a regression model. One category is coded as 0 and the other category of the variable is coded as 1.

14.69

You test whether the interaction of the dummy variable with each of the independent variables in the model makes a significant contribution to the regression model.

14.70

You will want to include an interaction term in a regression model if the effect of an independent variable on the response variable is dependent on the value of a second independent variable.
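A small numeric illustration may help here. The sketch below uses the fitted interaction model from Problem 14.73 in this chapter, Ŷ = –3.888 + 1.449X1 + 1.462X2 – 0.190X1X2, with the signs as implied by the worked results of that problem. The function names are illustrative. The slope of one variable is no longer a single number: it changes with the value of the other variable.

```python
# Sketch: predictions and conditional slopes for an interaction model,
# using the coefficients from Problem 14.73.
B0, B1, B2, B12 = -3.888, 1.449, 1.462, -0.190


def predict(x1, x2):
    """Predicted Y, including the X1*X2 interaction term."""
    return B0 + B1 * x1 + B2 * x2 + B12 * x1 * x2


def slope_of_x1(x2):
    """Change in predicted Y per unit of X1 at a fixed value of X2."""
    return B1 + B12 * x2


def slope_of_x2(x1):
    """Change in predicted Y per unit of X2 at a fixed value of X1."""
    return B2 + B12 * x1


for x1, x2 in [(2, 2), (2, 7), (7, 2), (7, 7)]:
    print(f"X1={x1}, X2={x2}: predicted Y = {predict(x1, x2):.3f}")

print(f"slope of X1 at X2=2: {slope_of_x1(2):.3f}, at X2=7: {slope_of_x1(7):.3f}")
print(f"slope of X2 at X1=2: {slope_of_x2(2):.3f}, at X1=7: {slope_of_x2(7):.3f}")
```

Because the interaction coefficient is negative, each slope shrinks as the other variable increases.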

14.71

It is assumed that the slope of the dependent variable Y with an independent variable X is the same for each of the two levels of the dummy variable.

14.72

When a regression analysis fails to yield a suitable linear model, a nonlinear model that expresses a curvilinear relationship, such as a quadratic regression model, may be suitable.

14.73

(a) Ŷ = –3.888 + 1.449X1 + 1.462X2 – 0.190(X1X2)
= –3.888 + 1.449(2) + 1.462(2) – 0.190(2)(2) = 1.174

(b) Ŷ = –3.888 + 1.449X1 + 1.462X2 – 0.190(X1X2)
= –3.888 + 1.449(2) + 1.462(7) – 0.190(2)(7) = 6.584

(c) Ŷ = –3.888 + 1.449X1 + 1.462X2 – 0.190(X1X2)
= –3.888 + 1.449(7) + 1.462(2) – 0.190(7)(2) = 6.519

(d) Ŷ = –3.888 + 1.449X1 + 1.462X2 – 0.190(X1X2)
= –3.888 + 1.449(7) + 1.462(7) – 0.190(7)(7) = 7.179

(e) Ŷ = –3.888 + 1.449X1 + 1.462X2 – 0.190(X1X2)
= –3.888 + 1.449X1 + 1.462(2) – 0.190(X1)(2)
= –0.964 + 1.069X1
The slope of X1 is 1.069.

(f) Ŷ = –3.888 + 1.449X1 + 1.462X2 – 0.190(X1X2)
= –3.888 + 1.449X1 + 1.462(7) – 0.190(X1)(7)
= 6.346 + 0.119X1
The slope of X1 is 0.119.

(g) Ŷ = –3.888 + 1.449X1 + 1.462X2 – 0.190(X1X2)
= –3.888 + 1.449(2) + 1.462X2 – 0.190(2)(X2)
= –0.99 + 1.082X2
The slope of X2 is 1.082.

(h) Ŷ = –3.888 + 1.449X1 + 1.462X2 – 0.190(X1X2)
= –3.888 + 1.449(7) + 1.462X2 – 0.190(7)(X2)
= 6.255 + 0.132X2
The slope of X2 is 0.132.

(i) Since the interaction between X1 and X2 is negative, a higher value of the perceived quality of the product, X1, will attenuate the effect of the perceived value of the product, X2, on the predicted value of purchasing behavior. Likewise, a higher value of the perceived value of the product, X2, will attenuate the effect of the perceived quality of the product, X1, on the predicted value of purchasing behavior.

14.74

(a) Ŷ = –3.9152 + 0.0319X1 + 4.2228X2, where X1 = number of cubic feet moved and X2 = number of pieces of large furniture.

(b) Holding constant the number of pieces of large furniture, for each additional cubic foot moved, the mean labor hours are estimated to increase by 0.0319. Holding constant the number of cubic feet moved, for each additional piece of large furniture, the mean labor hours are estimated to increase by 4.2228.

(c) Ŷ = 20.4926

(d) Based on a residual analysis, the errors appear to be normally distributed. The equal-variance assumption might be violated because the variances appear to be larger around the center region of both independent variables. There might also be a violation of the linearity assumption. A model with quadratic terms for both independent variables might be fitted.

(e) FSTAT = 228.80, p-value is virtually 0 < 0.05, reject H0. There is evidence of a significant relationship between labor hours and the two independent variables (the number of cubic feet moved and the number of pieces of large furniture).

(f) The p-value is virtually 0. The probability of obtaining a test statistic of 228.80 or greater is virtually 0 if there is no significant relationship between labor hours and the two independent variables (the number of cubic feet moved and the number of pieces of large furniture).

(g) r2 = 0.9327. 93.27% of the variation in labor hours can be explained by variation in the number of cubic feet moved and the number of pieces of large furniture.

(h) Adjusted r2 = 0.9287

(i) For X1: tSTAT = 6.9339, the p-value is virtually 0. Reject H0. The number of cubic feet moved makes a significant contribution and should be included in the model. For X2: tSTAT = 4.6192, the p-value is virtually 0. Reject H0. The number of pieces of large furniture makes a significant contribution and should be included in the model. Based on these results, the regression model with the two independent variables should be used.

(j) For X1: tSTAT = 6.9339, the p-value is virtually 0. The probability of obtaining a sample that will yield a test statistic greater than 6.9339 is virtually 0 if the number of cubic feet moved does not make a significant contribution, holding the effect of the number of pieces of large furniture constant. For X2: tSTAT = 4.6192, the p-value is virtually 0. The probability of obtaining a sample that will yield a test statistic greater than 4.6192 is virtually 0 if the number of pieces of large furniture does not make a significant contribution, holding the effect of the number of cubic feet moved constant.

(k) 0.0226 ≤ β1 ≤ 0.0413

(l) r2Y1.2 = 0.5930. Holding constant the effect of the number of pieces of large furniture, 59.3% of the variation in labor hours can be explained by variation in the number of cubic feet moved. r2Y2.1 = 0.3927. Holding constant the effect of the number of cubic feet moved, 39.27% of the variation in labor hours can be explained by variation in the number of pieces of large furniture.

(m) Both the number of cubic feet moved and the number of large pieces of furniture are useful in predicting the labor hours, but the cubic feet moved is more important.

14.75

From PHStat, Wins vs field goal percentage and three-point percentage

                           Coefficients   Standard Error    t Stat   P-value
Intercept                     -216.1443          49.1568   -4.3970    0.0002
Field Goal Percentage          237.7929         133.8594    1.7764    0.0869
Three-Point Percentage         416.9878         140.5164    2.9675    0.0062

(a) Ŷ = –216.1443 + 237.7929X1 + 416.9878X2, where X1 = field goal % and X2 = three-point field goal %, both expressed as proportions.

(b) For a given three-point field goal %, each increase of 1% (0.01) in field goal % increases the estimated mean number of wins by 2.38. For a given field goal %, each increase of 1% (0.01) in three-point field goal % increases the estimated mean number of wins by 4.17.

(c) Ŷ = –216.1443 + 237.7929(0.45) + 416.9878(0.35) = 36.8082 wins
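The prediction above plugs in 0.45 and 0.35, so the shooting percentages appear to be coded as proportions. Under that assumption, a one-percentage-point change corresponds to 0.01 units of X, and its effect on predicted wins is the coefficient multiplied by 0.01. The sketch below, with illustrative names, recomputes both per-point effects and the prediction from the printed coefficients.

```python
# Sketch: per-percentage-point effects and prediction for Problem 14.75.
# ASSUMPTION: percentages are coded as proportions (0.45 means 45%).
B0 = -216.1443
B_FIELD_GOAL = 237.7929
B_THREE_POINT = 416.9878


def predicted_wins(field_goal, three_point):
    """Predicted wins for shooting percentages given as proportions."""
    return B0 + B_FIELD_GOAL * field_goal + B_THREE_POINT * three_point


def effect_per_point(coefficient):
    """Change in predicted wins for a 0.01 (one point) increase in X."""
    return coefficient * 0.01


wins = predicted_wins(0.45, 0.35)
fg_effect = effect_per_point(B_FIELD_GOAL)
tp_effect = effect_per_point(B_THREE_POINT)

print(f"predicted wins = {wins:.4f}")
print(f"per point: field goal = {fg_effect:.2f}, three-point = {tp_effect:.2f}")
```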

Copyright ©2024 Pearson Education, Inc.


(d) Residual analysis does not reveal any potential violation of the regression assumptions.

(e) H0: β1 = β2 = 0   H1: Not all βj = 0 for j = 1, 2

Regression Analysis

Regression Statistics
Multiple R             0.7303
R Square               0.5334
Adjusted R Square      0.4988
Standard Error         8.1729
Observations               30

ANOVA
              df          SS          MS        F
Regression     2   2061.4855   1030.7427  15.4313
Residual      27   1803.4812     66.7956
Total         29   3864.9667

                          Coefficients  Standard Error    t Stat  P-value
Intercept                    -216.1443         49.1568   -4.3970   0.0002
Field Goal Percentage         237.7929        133.8594    1.7764   0.0869
Three-Point Percentage        416.9878        140.5164    2.9675   0.0062

FSTAT = 15.4313 with p-value = 0.0002. Since the p-value < 0.05, reject H0 at the 5% level of significance. There is evidence of a significant linear relationship between number of wins and the two explanatory variables.

(f) The p-value is 0.0002. The probability of obtaining an F test statistic equal to or larger than 15.4313 is 0.0002 if H0 is true.

(g) r² = SSR/SST = 0.5334. So, 53.34% of the variation in number of wins can be explained by variation in field goal % and three-point field goal %.

(h) Adjusted r² = 0.4988.

(i) For X1: tSTAT = b1/Sb1 = 1.7764 and p-value = 0.0869 > 0.05, do not reject H0. There is insufficient evidence that the variable X1 contributes to a model already containing X2. For X2: tSTAT = b2/Sb2 = 2.9675 and p-value = 0.0062 < 0.05, reject H0. There is evidence that the variable X2 contributes to a model already containing X1. X2 should be included in the model.

(j) For X1: p-value = 0.0869. The probability of obtaining a t test statistic that differs from 0 by 1.7764 or more in either direction is 8.69% if X1 is insignificant. For X2: p-value = 0.0062. The probability of obtaining a t test statistic that differs from 0 by 2.9675 or more in either direction is 0.62% if X2 is insignificant.


(k) From PHStat, Regression Analysis, Coefficients of Partial Determination:

    r²Y1.2    0.104647873
    r²Y2.1    0.24594251

    r²Y1.2 = 0.1046. Holding constant three-point field goal %, 10.46% of the variation in number of wins can be explained by variation in field goal % for the team. r²Y2.1 = 0.2459. Holding constant the effect of field goal % for the team, 24.59% of the variation in number of wins can be explained by variation in three-point field goal %.

(l) Both field goal % and three-point field goal % for the team are useful in predicting the number of wins.

14.76

(a) From PHStat, Asking price vs living space and age

Ŷ = 450.2780 + 0.0969X1 – 0.5151X2, where X1 = house size and X2 = age.

Regression Analysis

Regression Statistics
Multiple R             0.6340
R Square               0.4019
Adjusted R Square      0.3813
Standard Error        88.6341
Observations               61

ANOVA
              df            SS            MS        F
Regression     2   306209.3804   153104.6902  19.4889
Residual      58   455648.6622     7856.0114
Total         60   761858.0426

               Coefficients  Standard Error   t Stat  P-value  Lower 95%  Upper 95%
Intercept          450.2780         74.4900   6.0448   0.0000   301.1702   599.3859
Living Space         0.0969          0.0207   4.6903   0.0000     0.0555     0.1382
Age                 -0.5151          0.8174  -0.6302   0.5311    -2.1513     1.1211

(b) Holding constant the age, for each additional square foot in the size of the house, the mean asking price is estimated to increase by 0.0969 thousand dollars. Holding constant the living space of the house, for each additional year in age, the asking price is estimated to decrease by 0.5151 thousand dollars.

(c) Ŷ = 450.2780 + 0.0969(2,000) – 0.5151(55) = 615.686 thousand dollars.


(d) Based on a residual analysis, the model appears to be adequate.

(e) FSTAT = 19.4889, the p-value = 0.0000 < 0.05, reject H0. There is evidence of a significant relationship between asking price and the two independent variables (size of the house and age).

(f) The p-value is 0.0000. The probability of obtaining a test statistic of 19.4889 or greater is virtually 0 if there is no significant relationship between asking price and the two independent variables (living space of the house and age).

(g) r² = 0.4019. 40.19% of the variation in asking price can be explained by variation in the size of the house and age.

(h) Adjusted r² = 0.3813.

(i) For X1: tSTAT = 4.6903, the p-value is 0.0000. Reject H0. The living space of the house makes a significant contribution and should be included in the model. For X2: tSTAT = –0.6302, p-value = 0.5311 > 0.05. Do not reject H0. Age does not make a significant contribution and should not be included in the model. Based on these results, the regression model with only the size of the house should be used.

(j) For X1: tSTAT = 4.6903. The probability of obtaining a sample that will yield a test statistic farther away than 4.6903 is 0.0000 if the living space does not make a significant contribution, holding age constant. For X2: tSTAT = –0.6302. The probability of obtaining a sample that will yield a test statistic farther away than 0.6302 is 0.5311 if the age does not make a significant contribution, holding the effect of the living space constant.

(k) 0.0555 ≤ β1 ≤ 0.1382. You are 95% confident that the asking price will increase by an amount somewhere between $55.50 thousand and $138.20 thousand for each additional 1,000 square foot increase in living space, holding constant the age of the house. In Problem 13.76, you are 95% confident that the assessed value will increase by an amount somewhere between $71.0 thousand and $137.9 thousand for each additional 1,000 square foot increase in living space, regardless of the age of the house.


(l) From PHStat, Regression Analysis, Coefficients of Partial Determination:

    r²Y1.2    0.274988878
    r²Y2.1    0.006799874

    r²Y1.2 = 0.2750. Holding constant the effect of the age of the house, 27.50% of the variation in asking price can be explained by variation in the living space of the house. r²Y2.1 = 0.0068. Holding constant the effect of the size of the house, 0.68% of the variation in asking price can be explained by variation in the age of the house.

(m) Only the living space of the house should be used to predict asking price.

14.77

(a) Ŷ = 62.1411 + 2.0567X1 + 15.6418X2, where X1 = diameter of the tree at breast height of a person (in inches) and X2 = thickness of the bark (in inches).

(b) Holding constant the effects of the thickness of the bark, for each additional inch of increase in the diameter of the tree at breast height of a person, the height of the tree is estimated to increase by a mean of 2.0567 feet. Holding constant the effects of the diameter of the tree at breast height of a person, for each additional inch of increase in the thickness of the bark, the height of the tree is estimated to increase by a mean of 15.6418 feet.

(c) Ŷ = 62.1411 + 2.0567(25) + 15.6418(2) = 144.84 feet.

(d) r² = 0.7858. So 78.58% of the total variation in the height of the tree can be explained by the variations of both the diameter of the tree at breast height of a person and the thickness of the bark of the tree.

(e) [Residual plot: residuals versus diameter at breast height]

    [Residual plot: residuals versus bark thickness]

    The plot of the residuals against bark thickness indicates a potential pattern that may require the addition of nonlinear terms. One value appears to be an outlier in both plots.

(f) F = 33.0134 with 2 and 18 degrees of freedom. p-value = 9.49912E-07 < 0.05. Reject H0. At least one of the independent variables is linearly related to the dependent variable.

(g) 1.1264 ≤ β1 ≤ 2.9870   0.6238 ≤ β2 ≤ 30.6598

(h) Since 0 is not included in either of the 95% confidence intervals in (g), both explanatory variables should be included in this model.

(i) 134.0091 ≤ μY|X ≤ 155.6760

(j) 96.1452 ≤ YX ≤ 193.5399

(k) r²Y1.2 = 0.5452. For a given bark thickness of the tree, 54.52% of the variation in height can be explained by variation in the diameter of the tree at the breast height of a person. r²Y2.1 = 0.2101. For a given diameter of the tree at the breast height of a person, 21.01% of the variation in height can be explained by variation in bark thickness.

(l) None of the observations has a Cook's Di > Fα = 0.8194 with d.f. = 3 and 18. Hence, using the Studentized deleted residuals, hat matrix diagonal elements, and Cook's distance statistic together, there is insufficient evidence for removal of any observation from the model. Both the diameter of the tree and the thickness of the bark affect the height of the tree, but the diameter is more important.
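The arithmetic in 14.77(c) can be checked against the intervals in (i) and (j) with a short sketch. It assumes both intervals were evaluated at the same point as (c), a diameter of 25 inches and a bark thickness of 2 inches, in which case each interval should be centered on the point prediction. The function and constant names are illustrative only and do not come from PHStat.

```python
# Coefficients reported in 14.77(a); names are illustrative, not from PHStat.
B0, B_DIAMETER, B_BARK = 62.1411, 2.0567, 15.6418


def predict_height(diameter_in, bark_in):
    """Point prediction of tree height (feet) from the fitted equation."""
    return B0 + B_DIAMETER * diameter_in + B_BARK * bark_in


def midpoint(lower, upper):
    """Center of an interval; for these intervals it should equal Y-hat."""
    return (lower + upper) / 2


y_hat = predict_height(25, 2)             # part (c)
ci_mid = midpoint(134.0091, 155.6760)     # part (i), interval for the mean
pi_mid = midpoint(96.1452, 193.5399)      # part (j), interval for an individual

print(round(y_hat, 2), round(ci_mid, 2), round(pi_mid, 2))
```

Both midpoints agreeing with the prediction is a quick consistency check on the reported limits; it does not verify the interval widths.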


14.78

(a) From PHStat, Taxes vs asking price and age

Ŷ = –99.0443 + 8.1105X1 + 2.7558X2, where X1 = asking price and X2 = age.

Regression Analysis

Regression Statistics
Multiple R             0.9915
R Square               0.9830
Adjusted R Square      0.9824
Standard Error       119.7195
Observations               61

ANOVA
              df              SS              MS          F
Regression     2   48074668.2757   24037334.1378  1677.0894
Residual      58     831300.5637      14332.7683
Total         60   48905968.8393

               Coefficients  Standard Error   t Stat  P-value  Lower 95%  Upper 95%
Intercept          -99.0443        124.4368  -0.7959   0.4293  -348.1315   150.0430
Asking Price         8.1105          0.1510  53.7060   0.0000     7.8082     8.4127
Age                  2.7558          0.9896   2.7849   0.0072     0.7750     4.7366

(b) Holding age constant, for each additional $1,000 in asking price, the taxes are estimated to increase by a mean of $8.1105. Holding asking price constant, for each additional year, the taxes are estimated to increase by $2.7558.

(c) Ŷ = –99.0443 + 8.1105(400) + 2.7558(50) = $3,282.928


(d) Based on a residual analysis, the errors appear to be normally distributed. The equal-variance assumption appears to be valid. However, there is one very large residual that is from the house that is 107 years old. Removing this point still leaves a residual for the house that has an asking price of $550,000 and is 52 years old. However, because this model is an almost perfect fit, you may want to use this model. In this model, age is no longer significant.

(e) FSTAT = 1,677.0894, p-value = 0.0000 < 0.05, reject H0. There is evidence of a significant relationship between taxes and the two independent variables (asking price and age).

(f) p-value = 0.0000. The probability of obtaining an FSTAT test statistic of 1,677.0894 or greater is virtually 0 if there is no significant relationship between taxes and the two independent variables (asking price and age).

(g) r² = 0.9830. 98.30% of the variation in taxes can be explained by variation in asking price and age.

(h) Adjusted r² = 0.9824.

(i) For X1: tSTAT = 53.7060, p-value = 0.0000 < 0.05. Reject H0. The asking price makes a significant contribution and should be included in the model. For X2: tSTAT = 2.7849, p-value = 0.0072 < 0.05. Reject H0. The age of a house makes a significant contribution and should be included in the model. Based on these results, the regression model with asking price and age should be used.

(j) For X1: p-value = 0.0000. The probability of obtaining a sample that will yield a test statistic greater than 53.7060 is 0.0000 if the asking price does not make a significant contribution, holding age constant. For X2: p-value = 0.0072. The probability of obtaining a sample that will yield a test statistic greater than 2.7849 is 0.0072 if the age of a house does not make a significant contribution, holding the effect of the asking price constant.

(k) 7.8082 ≤ β1 ≤ 8.4127. You are 95% confident that the mean taxes will increase by an amount somewhere between $7.81 and $8.41 for each additional $1,000 increase in the asking price, holding constant the age. In Problem 13.77, you are 95% confident that the mean taxes will increase by an amount somewhere between $7.6447 and $8.2242 for each additional $1,000 increase in asking price, regardless of the age.


(l) From PHStat, Regression Analysis, Coefficients of Partial Determination:

    r²Y1.2    0.980287764
    r²Y2.1    0.117946139

    r²Y1.2 = 0.9803. Holding constant the effect of age, 98.03% of the variation in taxes can be explained by variation in the asking price. r²Y2.1 = 0.1179. Holding constant the effect of the asking price, 11.79% of the variation in taxes can be explained by variation in the age.

(m) Based on your answers to (b) through (k), the age of a house has an effect on its taxes.

14.79

(a) From PHStat, Wins vs ERA and runs per game

Ŷ = 69.1966 + 16.9662X1 – 15.3543X2, where X1 = runs per game and X2 = ERA (matching the coefficient table below).

Regression Analysis

Regression Statistics
Multiple R             0.9724
R Square               0.9455
Adjusted R Square      0.9415
Standard Error         3.5512
Observations               30

ANOVA
              df          SS          MS         F
Regression     2   5911.5086   2955.7543  234.3829
Residual      27    340.4914     12.6108
Total         29   6252.0000

                Coefficients  Standard Error    t Stat  P-value  Lower 95%  Upper 95%
Intercept            69.1966         11.9427    5.7940   0.0000    44.6922    93.7011
Runs per game        16.9662          1.8068    9.3902   0.0000    13.2590    20.6735
ERA                 -15.3543          1.4332  -10.7133   0.0000   -18.2950   -12.4137

(b) The Y intercept, b0, would be the mean number of wins when the ERA and runs per game are zero. This would not be meaningful in this case because a zero ERA and zero runs per game would not be reasonable. For each one-unit increase in runs per game, the predicted mean number of wins would increase by 16.9662, while holding ERA constant. For each one-unit increase in ERA, the predicted mean number of wins would decrease by 15.3543, while holding runs per game constant.

(c) Ŷ = 69.1966 + 16.9662(4.50) – 15.3543(4.6) = 74.9147 wins.


(d) There is no pattern in the relationship between residuals and the predicted value of Y, the value of ERA, or the value of runs scored per game. The regression assumptions are satisfied.

(e) At the 0.05 significance level, there is evidence of a significant linear relationship between number of wins and the independent variables, ERA and runs scored per game. Because FSTAT = 234.3829 and p-value = 0.0000, reject H0.

(f) The p-value from (e) indicates that the probability of obtaining an FSTAT of 234.3829 or larger is 0.000 when the null hypothesis, β1 = β2 = 0, is true.

(g) r² = 0.9455. Thus, 94.55% of the variation in wins can be explained by the variation in ERA and the variation in runs per game.

(h) Adjusted r² = 0.9415. The adjusted r² takes into account the number of independent variables and the sample size.

(i) At the 0.05 significance level, there is evidence of a linear relationship between runs scored per game and number of wins. For X1 (runs per game): because tSTAT = 9.3902 and p-value = 0.0000, reject H0 and include runs scored per game in the model. At the 0.05 significance level, there is evidence of a linear relationship between ERA and the number of wins. For X2 (ERA): because tSTAT = –10.7133 and p-value = 0.0000, reject H0 and include ERA in the model. Based on these results, both ERA and runs scored per game should be included in the model.

(j) The p-value for the runs scored per game independent variable, X1, indicates that the probability of obtaining a tSTAT of 9.3902 or larger is 0.000 when the null hypothesis is true. The p-value for the ERA independent variable, X2, indicates that the probability of obtaining a tSTAT of –10.7133 or less is 0.000 when the null hypothesis is true.

(k) 13.2590 ≤ β1 ≤ 20.6735


(l) From PHStat, Regression Analysis, Coefficients of Partial Determination:

    r²Y1.2    0.765575645
    r²Y2.1    0.809558207

    r²Y1.2 = 0.7656. Holding the effect of ERA constant, 76.56% of the variation in number of wins can be explained by the variation in runs scored per game. r²Y2.1 = 0.8096. Holding the effect of runs scored per game constant, 80.96% of the variation in number of wins can be explained by the variation in ERA.

(m) Pitching contributes more to the number of wins because ERA accounts for more of the variation in number of wins when holding runs per game constant (80.96%) than runs per game accounts for when holding ERA constant (76.56%).

14.80

(a) From PHStat, Wins vs ERA and league

Ŷ = 172.4619 – 23.5464X1 + 3.7990X2, where X1 = ERA and X2 = league (American = 0, National = 1).

Regression Analysis

Regression Statistics
Multiple R             0.8858
R Square               0.7846
Adjusted R Square      0.7686
Standard Error         7.0629
Observations               30

ANOVA
              df          SS          MS        F
Regression     2   4905.1172   2452.5586  49.1647
Residual      27   1346.8828     49.8845
Total         29   6252.0000

            Coefficients  Standard Error   t Stat  P-value  Lower 95%  Upper 95%
Intercept       172.4619          9.3894  18.3677   0.0000   153.1964   191.7273
ERA             -23.5464          2.3747  -9.9156   0.0000   -28.4188   -18.6739
League            3.7990          2.6114   1.4548   0.1573    -1.5591     9.1572

(b) Holding constant the effect of the league, for each additional earned run, the number of wins is estimated to decrease by 23.5464. For a given ERA, a team in the National League is estimated to have 3.7990 more wins than a team in the American League.

(c) Ŷ = 172.4619 – 23.5464(4.50) + 3.7990(0) = 66.503 wins.


(d) Based on a residual analysis, there is no pattern in the errors. There is no apparent violation of other assumptions.

(e) FSTAT = 49.1647 > 3.35, p-value = 0.0000 < 0.05, reject H0. There is evidence of a significant relationship between wins and the two independent variables (ERA and league).

(f) For X1: tSTAT = –9.9156 < –2.0518, the p-value = 0.0000. Reject H0. ERA makes a significant contribution and should be included in the model. For X2: tSTAT = 1.4548 < 2.0518, p-value = 0.1573 > 0.05. Do not reject H0. The league does not make a significant contribution and should not be included in the model.

(g) Based on these results, the regression model with only the ERA as the independent variable should be used.

(h) –28.4188 ≤ β1 ≤ –18.6739   –1.5591 ≤ β2 ≤ 9.1572

(i) Adjusted r² = 0.7686. 76.86% of the variation in wins can be explained by the variation in ERA and league after adjusting for number of independent variables and sample size.

(j) From PHStat, Regression Analysis, Coefficients of Partial Determination:

    r²Y1.2    0.78454931
    r²Y2.1    0.072687011

    r²Y1.2 = 0.7845. Holding constant the effect of league, 78.45% of the variation in number of wins can be explained by the variation in ERA. r²Y2.1 = 0.0727. Holding constant the effect of ERA, 7.27% of the variation in number of wins can be explained by the variation in league.

(k) The slope of the number of wins with ERA is the same, regardless of whether the team belongs to the American League or the National League.

(l) For X1X2: tSTAT = –0.2083 > –2.0555, the p-value is 0.8366 > 0.05. Do not reject H0. There is no evidence that the interaction term makes a contribution to the model.

(m) The model with one independent variable (ERA) should be used.
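The coefficients of partial determination in 14.80(j) can be recovered from the t statistics in the coefficient table, using the standard identity r² = t² / (t² + n – k – 1) for a variable's contribution given the other variables in the model. The sketch below is only a cross-check under that identity; the function name is hypothetical and is not a PHStat routine.

```python
def partial_r2(t_stat, n, k):
    """Coefficient of partial determination implied by a slope's t statistic.

    n is the sample size and k the number of independent variables, so
    n - k - 1 is the residual degrees of freedom of the full model.
    """
    df_resid = n - k - 1
    return t_stat ** 2 / (t_stat ** 2 + df_resid)


# t statistics from the 14.80 coefficient table (n = 30 teams, k = 2).
r2_era = partial_r2(-9.9156, n=30, k=2)
r2_league = partial_r2(1.4548, n=30, k=2)

print(round(r2_era, 4), round(r2_league, 4))
```

Small differences from the PHStat values in later decimal places come from using t statistics rounded to four places.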

14.81

Model with interaction terms:

Ŷ = 983.4037 + 0.0249X1 – 5.8443X2 – 1.9300X3 – 0.0000X1X2 + 0.0205X1X3 – 0.4307X2X3,
where X1 = House size, X2 = Age, X3 = 0 if Glen Cove, 1 if Merrick

PHStat output:

Regression Statistics
Multiple R             0.7279
R Square               0.5298
Adjusted R Square      0.5106
Standard Error       271.5296
Observations              154

ANOVA
              df              SS             MS        F
Regression     6   12210914.1616   2035152.3603  27.6034
Residual     147   10838062.6760     73728.3175
Total        153   23048976.8377

                 Coefficients  Standard Error   t Stat  P-value
Intercept            983.4037        148.7121   6.6128   0.0000
House Size             0.0249          0.0059   4.2033   0.0000
Age                   -5.8443          1.8211  -3.2092   0.0016
Location              -1.9300        158.0724  -0.0122   0.9903
Size*Age               0.0000          0.0001  -0.3661   0.7148
Size*Location          0.0205          0.0089   2.3108   0.0222
Age*Location          -0.4307          1.9446  -0.2215   0.8250


Model without interaction terms:

Ŷ = 986.1017 + 0.0247X1 – 6.2045X2 + 133.0510X3,
where X1 = House size, X2 = Age, X3 = 0 if Glen Cove, 1 if Merrick

PHStat output:

Regression Statistics
Multiple R             0.7157
R Square               0.5122
Adjusted R Square      0.5024
Standard Error       273.7885
Observations              154

ANOVA
              df              SS             MS        F
Regression     3   11804955.7623   3934985.2541  52.4944
Residual     150   11244021.0753     74960.1405
Total        153   23048976.8377

              Coefficients  Standard Error   t Stat  P-value
Intercept         986.1017         80.3543  12.2719   0.0000
House Size          0.0247          0.0025   9.8176   0.0000
Age                -6.2045          0.8707  -7.1260   0.0000
Location          133.0510         49.4102   2.6928   0.0079

Partial F test for the interaction effects:

H0: β4 = β5 = β6 = 0   H1: Not all βj = 0 for j = 4, 5, 6

FSTAT = {[SSR(X1, X2, X3, X4, X5, X6) – SSR(X1, X2, X3)] / 3} / MSE(X1, X2, X3, X4, X5, X6) = 13.3268 with 3 numerator and 53 denominator degrees of freedom. The p-value is 0.0000. At the 5% level of significance, the interaction terms are significant together.


Individual t test of the slope parameters:

H0: βj = 0   H1: βj ≠ 0

Using the 5% level of significance, the interaction between land and age, and the interaction between age and the Glen Cove dummy variable are significant in explaining the variation of fair market value.

Model with land, land and age interaction, and land and Glen Cove dummy interaction:

PHStat output:

Regression Statistics
Multiple R             0.7160
R Square               0.5127
Adjusted R Square      0.4962
Standard Error       275.4819
Observations              154

ANOVA
              df              SS             MS        F
Regression     5   11817218.7218   2363443.7444  31.1429
Residual     148   11231758.1158     75890.2575
Total        153   23048976.8377

                Coefficients  Standard Error   t Stat  P-value
Intercept           944.6140        149.9124   6.3011   0.0000
House Size            0.0269          0.0059   4.5285   0.0000
Age                  -5.6850          1.8463  -3.0791   0.0025
Location            156.2574        144.5545   1.0810   0.2815
Size*Age              0.0000          0.0001  -0.3997   0.6900
Age*Location         -0.2788          1.9718  -0.1414   0.8877

Ŷ = 944.6140 + 0.0269X1 – 5.6850X2 + 156.2574X3 – 0.0X1X2 – 0.2788X2X3

H0: βj = 0   H1: βj ≠ 0

All the slope parameters are significant individually at the 5% level of significance. The final model should use land, age, Glen Cove dummy variable, land and age interaction, and age and Glen Cove dummy variable interaction.

14.82

The multiple regression model is Predicted base salary = 48,091.7853 + 8,249.2156 (gender) + 1,061.4521 (age). Holding constant the age of the person, the mean base salary is predicted to be $8,249.22 higher for males than for females. Holding constant the gender of the person, for each additional year of age, the mean base salary is predicted to be $1,061.45 higher. The regression model with the two independent variables has F = 118.0925 and a p-value = 0.0000. So, you can conclude that at least one of the independent variables makes a significant contribution to the model to predict base pay. Each independent variable makes a significant contribution to the regression model given that the other variable is included (tSTAT = 3.9937, p-value = 0.0001 for gender and tSTAT = 14.8592, p-value = 0.0000 for age). Both independent variables should be included in the model. 37.01% of the variation in base salary can be explained by gender and age.

There is no pattern in the residuals and no other violations of the assumptions, so the model appears to be appropriate. Including an interaction term of gender and age does not significantly improve the model (tSTAT = –0.2371, p-value = 0.8127 > 0.05). You can conclude that females are paid less than males holding constant the age of the person. Perhaps other variables such as department, seniority, and score on a performance evaluation can be included in the model to see if the model is improved.
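The role of the gender dummy variable in 14.82 can be illustrated with a short sketch. It assumes gender is coded 1 for male and 0 for female, which is what the interpretation above implies but is not stated outright, and the age of 40 is an arbitrary illustrative value. The function and constant names are hypothetical.

```python
# Coefficients from the fitted model in 14.82.
B0, B_GENDER, B_AGE = 48091.7853, 8249.2156, 1061.4521


def predicted_salary(is_male, age):
    """Predicted base salary; is_male is assumed coded 1 = male, 0 = female."""
    return B0 + B_GENDER * is_male + B_AGE * age


# At any fixed age the dummy shifts the prediction by its coefficient,
# so the two fitted lines are parallel (no interaction term in the model).
gap_at_40 = predicted_salary(1, 40) - predicted_salary(0, 40)
per_year = predicted_salary(0, 41) - predicted_salary(0, 40)

print(round(gap_at_40, 2), round(per_year, 2))
```

Because the model has no interaction term, the gap is the same at every age, which is consistent with the nonsignificant interaction test reported above.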

14.83

Excel output:

Regression Statistics
Multiple R             0.7520
R Square               0.5655
Adjusted R Square      0.4785
Standard Error         0.9136
Observations               19

ANOVA
              df       SS      MS       F  Significance F
Regression     3  16.2908  5.4303  6.5063          0.0049
Residual      15  12.5192  0.8346
Total         18  28.8100

            Coefficients  Standard Error   t Stat  P-value  Lower 95%  Upper 95%
Intercept       -18.6915          7.9789  -2.3426   0.0334   -35.6982    -1.6848
Viscosity         0.0121          0.0082   1.4817   0.1591    -0.0053     0.0296
Pressure          0.0844          0.0414   2.0415   0.0592    -0.0037     0.1726
Plate Gap         0.5000          0.1379   3.6271   0.0025     0.2062     0.7938

The r² of the multiple regression is 0.5655. So 56.55% of the variation in tear rating can be explained by the variation of viscosity, pressure, and plate gap on the bag-sealing equipment. The F test statistic for the combined significance of viscosity, pressure, and plate gap on the bag-sealing equipment is 6.5063 with a p-value of 0.0049. Hence, at a 5% level of significance, there is enough evidence to conclude that viscosity, pressure, and plate gap on the bag-sealing equipment affect tear rating. The p-value of the t test for the significance of viscosity is 0.1591, which is larger than 5%. Hence, there is not sufficient evidence to conclude that viscosity affects tear rating holding constant the effect of pressure and plate gap on the bag-sealing equipment. The p-value of the t test for the significance of pressure is 0.0592, which is also larger than 5%. There is not enough evidence to conclude that pressure affects tear rating at the 5% level of significance holding constant the effect of viscosity and plate gap on the bag-sealing equipment. The p-value of the t test for the significance of plate gap is 0.0025, which is smaller than 5%. There is enough evidence to conclude that plate gap affects tear rating at the 5% level of significance holding constant the effect of viscosity and pressure.


14.83 cont.

Excel output after dropping viscosity and pressure:

Regression Statistics
Multiple R             0.6173
R Square               0.3811
Adjusted R Square      0.3447
Standard Error         1.0241
Observations               19

ANOVA
              df       SS       MS        F  Significance F
Regression     1  10.9800  10.9800  10.4689          0.0049
Residual      17  17.8300   1.0488
Total         18  28.8100

            Coefficients  Standard Error  t Stat  P-value  Lower 95%  Upper 95%
Intercept         0.7500          0.2349  3.1922   0.0053     0.2543     1.2457
Plate Gap         0.5000          0.1545  3.2356   0.0049     0.1740     0.8260

Plate gap still remains statistically significant at the 5% level of significance. Hence, only plate gap on the bag-sealing equipment needs to be used in the model.

[Residual plot: residuals versus X]

The residual plot suggests that the equal variance assumption is likely violated.

[Normal probability plot: residuals versus Z value]

[Boxplot: residuals]

The normal probability plot and the boxplot both suggest that the normal distribution assumption is also likely violated.
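The summary statistics in both 14.83 outputs follow directly from the sums of squares in the ANOVA tables. The sketch below recomputes r², adjusted r², and FSTAT from SSR and SSE using the usual definitions; the function name is illustrative and is not an Excel or PHStat routine.

```python
def anova_summary(ssr, sse, n, k):
    """Return (r2, adjusted r2, F) from regression and residual sums of squares.

    n is the number of observations and k the number of independent variables.
    """
    sst = ssr + sse
    r2 = ssr / sst
    adj_r2 = 1 - (1 - r2) * (n - 1) / (n - k - 1)
    f_stat = (ssr / k) / (sse / (n - k - 1))
    return r2, adj_r2, f_stat


# Full model: viscosity, pressure, and plate gap (k = 3).
full = anova_summary(16.2908, 12.5192, n=19, k=3)
# Reduced model: plate gap only (k = 1).
reduced = anova_summary(10.9800, 17.8300, n=19, k=1)

print([round(v, 4) for v in full])
print([round(v, 4) for v in reduced])
```

Comparing the two rows shows adjusted r² falling from about 0.48 to about 0.34 when viscosity and pressure are dropped, even though neither was individually significant at the 5% level.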


14.84

b0 = 18.2892, b1 (die temperature) = 0.5976, b2 (die diameter) = –13.5108. The r² of the multiple regression model is 0.3257, so 32.57% of the variation in unit density can be explained by the variation of die temperature and die diameter. The F test statistic for the combined significance of die temperature and die diameter is 5.0718 with a p-value of 0.0160. Hence, at a 5% level of significance, there is enough evidence to conclude that die temperature and die diameter affect unit density. The p-value of the t test for the significance of die temperature is 0.2117, which is greater than 5%. Hence, there is insufficient evidence to conclude that die temperature affects unit density holding constant the effect of die diameter. The p-value of the t test for the significance of die diameter is 0.0083, which is less than 5%. There is enough evidence to conclude that die diameter affects unit density at the 5% level of significance holding constant the effect of die temperature.

After removing die temperature from the model, b0 = 107.9267, b1 (die diameter) = –13.5108. The r² of the regression is 0.2724. So 27.24% of the variation in unit density can be explained by the variation of die diameter. The p-value of the t test for the significance of die diameter is 0.0087, which is less than 5%. There is enough evidence to conclude that die diameter affects unit density at the 5% level of significance. There is some lack of equality in the variance of the residuals and some departure from normality.

14.85

Excel output:

Regression Statistics
Multiple R             0.3101
R Square               0.0961
Adjusted R Square      0.0101
Standard Error         1.4439
Observations               24

ANOVA
              df       SS      MS       F  Significance F
Regression     2   4.6572  2.3286  1.1169          0.3460
Residual      21  43.7810  2.0848
Total         23  48.4382

                  Coefficients  Standard Error   t Stat  P-value  Lower 95%  Upper 95%
Intercept               1.6308          9.0843   0.1795   0.8592   -17.2609    20.5226
Die Temperature         0.0681          0.0589   1.1550   0.2611    -0.0545     0.1907
Die Diameter           -0.5592          0.5895  -0.9486   0.3536    -1.7850     0.6667

The r² of the multiple regression is 0.0961. So 9.61% of the variation in foam diameter can be explained by the variation of die temperature and die diameter. The F test statistic for the combined significance of die temperature and die diameter is 1.1169 with a p-value of 0.3460. Hence, at a 5% level of significance, there is not enough evidence to conclude that die temperature and die diameter affect foam diameter. The p-value of the t test for the significance of die temperature is 0.2611, which is larger than 5%. Hence, there is not sufficient evidence to conclude that die temperature affects foam diameter holding constant the effect of die diameter. The p-value of the t test for the significance of die diameter is 0.3536, which is also larger than 5%. There is not enough evidence to conclude that die diameter affects foam diameter at the 5% level of significance holding constant the effect of die temperature. Neither of the two independent variables should be kept in the model.
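The overall F statistic in 14.85 can also be written in terms of r² alone, F = (r² / k) / ((1 – r²) / (n – k – 1)), which makes it easy to see why such a small r² yields an F close to 1. The sketch below applies that identity to the ANOVA values above; the function name is illustrative.

```python
def f_from_r2(r2, n, k):
    """Overall F statistic implied by r2 with n observations and k predictors."""
    return (r2 / k) / ((1 - r2) / (n - k - 1))


# r2 recomputed from SSR / SST in the 14.85 ANOVA table to limit rounding error.
r2 = 4.6572 / 48.4382
f_stat = f_from_r2(r2, n=24, k=2)

print(round(r2, 4), round(f_stat, 4))
```

Using the rounded value r² = 0.0961 instead gives a slightly different F in the third decimal place, which is only rounding error.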


Chapter 15

15.1

Multicollinearity exists when the supposedly independent variables used to estimate a dependent variable are so highly correlated with one another that the contribution of each independent variable to the variation in the dependent variable cannot be determined.

15.2

$VIF = \frac{1}{1 - 0.7} = 3.33$

15.3

$VIF = \frac{1}{1 - 0.2} = 1.25$
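The two VIF calculations above can be checked with a short Python sketch. The function name `vif` is introduced here only for illustration; the cutoff of 5 is the one these solutions use when judging collinearity.

```python
def vif(r_squared: float) -> float:
    """Variance inflationary factor for one independent variable.

    r_squared is the coefficient of determination obtained by regressing
    that variable on all the other independent variables.
    """
    if not 0.0 <= r_squared < 1.0:
        raise ValueError("r_squared must be in [0, 1)")
    return 1.0 / (1.0 - r_squared)


# R-square values taken from problems 15.2 and 15.3.
for r2 in (0.7, 0.2):
    value = vif(r2)
    # A VIF below 5 is treated in these solutions as no evidence of collinearity.
    print(f"R^2 = {r2}: VIF = {value:.2f}, below 5: {value < 5}")
```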

15.4

From PHStat

Regression Analysis    Efficiency Ratio and all other X    Risk-Based Capital and all other X

Regression Statistics
Multiple R             0.0184                              0.0184
R Square               0.0003                              0.0003
Adjusted R Square      -0.0047                             -0.0047
Standard Error         8.4810                              24.4994
Observations           200                                 200
VIF                    1.0003                              1.0003

$R_1^2 = 0.0003$, $VIF_1 = \frac{1}{1 - 0.0003} = 1.0003$

$R_2^2 = 0.0003$, $VIF_2 = \frac{1}{1 - 0.0003} = 1.0003$

There is no evidence of collinearity because both VIFs are < 5.

15.5

$VIF = \frac{1}{1 - 0.0535} = 1.0565$

There is no reason to suspect the existence of collinearity.

15.6

From PHStat

Regression Analysis    Worldwide Revenues and all other X    Number of New Graduates Hired and all other X

Regression Statistics
Multiple R             0.3157                                0.3157
R Square               0.0997                                0.0997
Adjusted R Square      0.0875                                0.0875
Standard Error         29.9736                               1294.3388
Observations           76                                    76
VIF                    1.1107                                1.1107

$R_1^2 = 0.0997$, $VIF_1 = \frac{1}{1 - 0.0997} = 1.1107$

$R_2^2 = 0.0997$, $VIF_2 = \frac{1}{1 - 0.0997} = 1.1107$

There is no evidence of collinearity because both VIFs are < 5.

15.7

$R_1^2 = 0.1444$, $VIF_1 = \frac{1}{1 - 0.1444} = 1.169$

$R_2^2 = 0.1444$, $VIF_2 = \frac{1}{1 - 0.1444} = 1.169$

There is no reason to suspect the existence of collinearity.

15.8

From PHStat

Regression Analysis    House Size and all other X    Age and all other X

Regression Statistics
Multiple R             0.1282                        0.1282
R Square               0.0164                        0.0164
Adjusted R Square      0.0000                        0.0000
Standard Error         13312.5581                    29.2630
Observations           62                            62
VIF                    1.0167                        1.0167

$VIF = \frac{1}{1 - 0.0164} = 1.0167$

There is no evidence of collinearity.


15.9

In feature selection, the independent variables to include in a model are selected from the list of candidate variables.

15.10

The principle of parsimony allows you to choose the minimum number of independent variables to include in a model.

15.11

(a)

For the model that includes independent variables A and B, the value of Cp exceeds 3, the number of parameters, so this model does not meet the criterion for further consideration. For the model that includes independent variables A and C, the value of Cp is less than or equal to 3, the number of parameters, so this model does meet the criterion for further consideration. For the model that includes independent variables A, B, and C, the value of Cp exceeds 4, the number of parameters, so this model does not meet the criterion for further consideration. The inclusion of variable C in the model does not appear to improve the model’s ability to explain variation in the dependent variable sufficiently to justify its inclusion in a model that contains only variables A and B.

15.12

Stepwise regression uses the t or F statistics to determine whether a variable should be entered into or deleted from a model, while best subsets regression uses the Cp statistic to determine the best models to consider. Stepwise regression attempts to find the best regression model without examining all possible regressions by adding and subtracting X variables at each step of the process. Best subsets regression examines each possible regression model and uses the Cp statistic to determine which models can be considered to be good fitting models.
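As a rough illustration of how the Cp statistic is used, the sketch below computes Cp from the quantities that PHStat lists under "Intermediate Calculations" (R2T, n, T) and applies the Cp ≤ k + 1 criterion. The function names are hypothetical, and the formula is the standard textbook form of Cp; the input values are the rounded figures reported in the best subsets output for problem 15.13, so the results agree with that output only up to rounding.

```python
def mallows_cp(r2_k: float, k: int, n: int, T: int, r2_T: float) -> float:
    """Cp for a model with k independent variables.

    r2_k : R-square of the candidate model
    n    : number of observations
    T    : number of parameters (including the intercept) in the full model
    r2_T : R-square of the full model
    """
    return (1.0 - r2_k) * (n - T) / (1.0 - r2_T) - (n - 2 * (k + 1))


def meets_cp_criterion(cp: float, k: int) -> bool:
    """A model is kept for further consideration when Cp <= k + 1."""
    return cp <= k + 1


# Rounded values from the 15.13 best subsets output: n = 35, T = 5, R2T = 0.836357.
n, T, r2_T = 35, 5, 0.836357
for name, r2_k, k in [("X1", 0.4637, 1), ("X1X3X4", 0.8349, 3), ("X1X2X3X4", 0.836357, 4)]:
    cp = mallows_cp(r2_k, k, n, T, r2_T)
    print(f"{name}: Cp = {cp:.2f}, k + 1 = {k + 1}, keep: {meets_cp_criterion(cp, k)}")
```

By construction, the full model always has Cp equal to its own number of parameters, which is why Cp alone cannot distinguish it from a more parsimonious model that also meets the criterion.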

15.13

From PHStat, after Multiple Regression Analysis, including VIF, where Y = mean starting salary, X1 = tuition, X2 = average GMAT score, X3 = acceptance percentage, X4 = percent with job offers

Regression Analysis

Regression Statistics
Multiple R           0.9145
R Square             0.8364
Adjusted R Square    0.8145
Standard Error       14920.0070
Observations         35

ANOVA
              df    SS                  MS                 F
Regression    4     34131388662.4270    8532847165.6067    38.3315
Residual      30    6678198270.2588     222606609.0086
Total         34    40809586932.6857

                                           Coefficients   Standard Error   t Stat     P-value   R Square   VIF
Intercept                                  44742.2327     40286.3146       1.1106     0.2756
Per-Year Tuition                           0.5324         0.2417           2.2030     0.0354    0.4823     1.9316
Average GMAT Score                         30.6930        59.8615          0.5127     0.6119    0.5586     2.2653
Acceptance Percentage                      -547.3644      190.2569         -2.8770    0.0073    0.5942     2.4643
Graduates Employed at Graduation Pctage    885.9678       202.7716         4.3693     0.0001    0.4980     1.9920

Based on a full regression model involving all of the variables, all the VIF values (1.9316, 2.2653, 2.4643, 1.9920, respectively) are less than 5. There is no reason to suspect the existence of collinearity.


From PHStat, Best Subsets Analysis, X1 = tuition, X2 = average GMAT score, X3 = acceptance percentage, X4 = percent with job offers

Best Subsets Analysis

Intermediate Calculations
R2T        0.836357
1 - R2T    0.163643
n          35
T          5
n-T        30

Model       Cp         k+1    R Square    Adj. R Square    Std. Error
X1          67.3226    2      0.4637      0.4474           25753.6529
X2          62.7813    2      0.4884      0.4729           25151.8681
X3          30.8211    2      0.6628      0.6526           20421.1629
X4          30.6912    2      0.6635      0.6533           20399.6963
X1X2        47.2291    3      0.5842      0.5582           23027.9093
X1X3        24.2435    3      0.7096      0.6914           19245.4138
X1X4        10.7179    3      0.7833      0.7698           16622.1443
X2X3        23.9291    3      0.7113      0.6932           19188.5066
X2X4        21.3111    3      0.7256      0.7084           18707.9312
X3X4        7.8969     3      0.7987      0.7862           16020.9679
X1X2X3      22.0907    4      0.7322      0.7063           18775.3386
X1X2X4      11.2770    4      0.7912      0.7710           16578.9516
X1X3X4      3.2629     4      0.8349      0.8189           14741.5590
X2X3X4      7.8533     4      0.8099      0.7915           15820.1386
X1X2X3X4    5.0000     5      0.8364      0.8145           14920.0070

Based on a best subsets regression and examination of the resulting Cp values, the best model appears to be a model with variables X1, X3 and X4, which has Cp = 3.2629. Models that add other variables do not change the results very much.


The residual plots reveal no clear pattern, indicating no evidence for violation of the equal variance or linearity assumptions. The normal probability plot indicates a slight deviation from normality.


From PHStat, Stepwise Regression Analysis, X1 = tuition, X2 = average GMAT score, X3 = acceptance percentage, X4 = percent with job offers

Stepwise Regression Analysis
Table of Results for General Stepwise

Graduates Employed at Graduation Pctage entered.

              df    SS                  MS                  F
Regression    1     27076715818.2036    27076715818.2036    65.0652
Residual      33    13732871114.4821    416147609.5298
Total         34    40809586932.6857

                                           Coefficients   Standard Error   t Stat    P-value
Intercept                                  26392.8575     14562.0657       1.8124    0.0790
Graduates Employed at Graduation Pctage    1584.5164      196.4366         8.0663    0.0000

Acceptance Percentage entered.

              df    SS                  MS                  F
Regression    2     32596101726.9610    16298050863.4805    63.4977
Residual      32    8213485205.7247     256671412.6789
Total         34    40809586932.6857

                                           Coefficients   Standard Error   t Stat     P-value
Intercept                                  102703.4711    20039.8461       5.1250     0.0000
Graduates Employed at Graduation Pctage    955.2549       205.4603         4.6493     0.0001
Acceptance Percentage                      -803.7266      173.3212         -4.6372    0.0001

Per-Year Tuition entered.

              df    SS                  MS                  F
Regression    3     34072866482.0294    11357622160.6765    52.2638
Residual      31    6736720450.6563     217313562.9244
Total         34    40809586932.6857

                                           Coefficients   Standard Error   t Stat     P-value
Intercept                                  61063.9936     24395.8895       2.5030     0.0178
Graduates Employed at Graduation Pctage    919.6437       189.5455         4.8518     0.0000
Acceptance Percentage                      -569.6082      183.0291         -3.1121    0.0040
Per-Year Tuition                           0.5782         0.2218           2.6068     0.0139

No other variables could be entered into the model. Stepwise ends.

Based on a stepwise regression analysis with all the original variables, only X1, X3 and X4 make a significant contribution to the model at the 0.05 level. Thus, the best model is the one that includes tuition (X1), acceptance percentage (X3), and percent with job offers (X4).

15.14

From PHStat, after Multiple Regression Analysis, including VIF, where Y = asking price, X1 = lot size, X2 = living space, X3 = number of bedrooms, X4 = number of bathrooms, X5 = age, X6 = fireplace (0 = No, 1 = Yes).

Regression Analysis

Regression Statistics
Multiple R           0.6970
R Square             0.4859
Adjusted R Square    0.4287
Standard Error       85.1687
Observations         61

ANOVA
              df    SS             MS            F
Regression    6     370157.9011    61692.9835    8.5050
Residual      54    391700.1416    7253.7063
Total         60    761858.0426

                Coefficients   Standard Error   t Stat     P-value   R Square   VIF
Intercept       372.6087       95.1280          3.9169     0.0003
Lot Size        69.3349        81.5135          0.8506     0.3988    0.2833     1.3953
Living Space    0.0740         0.0235           3.1459     0.0027    0.5278     2.1175
Bedrooms        8.8378         17.2349          0.5128     0.6102    0.5210     2.0878
Bathrooms       3.1427         22.1538          0.1419     0.8877    0.5751     2.3537
Age             -0.3351        0.8532           -0.3927    0.6961    0.4384     1.7807
Fireplace       66.8626        25.0072          2.6737     0.0099    0.0858     1.0939

Based on a full regression model involving all of the variables, all the VIF values (1.3953, 2.1175, 2.0878, 2.3537, 1.7807, 1.0939, respectively) are less than 5. There is no reason to suspect the existence of collinearity.


From PHStat, Best Subsets Analysis (partial display), X1 = lot size, X2 = living space, X3 = number of bedrooms, X4 = number of bathrooms, X5 = age, X6 = fireplace

Best Subsets Analysis

Intermediate Calculations
R2T        0.485862
1 - R2T    0.514138
n          61
T          7
n-T        54

Model    Cp         k+1    R Square    Adj. R Square    Std. Error
X1       28.9997    2      0.1812      0.1673           102.8259
X2       6.2460     2      0.3978      0.3876           88.1801
X3       32.4034    2      0.1488      0.1344           104.8410
X4       31.0924    2      0.1613      0.1471           104.0694
X5       29.6414    2      0.1751      0.1611           103.2088
X6       32.6028    2      0.1469      0.1324           104.9578
X1X2     6.3800     3      0.4156      0.3954           87.6152
X1X3     24.0514    3      0.2473      0.2214           99.4307
X1X4     23.8039    3      0.2497      0.2238           99.2750
X1X5     23.4593    3      0.2530      0.2272           99.0577
X1X6     21.3329    3      0.2732      0.2482           97.7062
X2X3     8.0988     3      0.3992      0.3785           88.8334
X2X4     8.0059     3      0.4001      0.3794           88.7680
X2X5     7.8160     3      0.4019      0.3813           88.6341
X2X6     0.8703     3      0.4681      0.4497           83.5904

Based on a best subsets regression and examination of the resulting Cp values, the best model appears to be a model with variables X2 and X6, which has Cp = 0.8703. Models that add other variables do not change the results very much.

[Figure: residual plots]

From PHStat, Stepwise Regression Analysis, X1 = lot size, X2 = living space, X3 = number of bedrooms, X4 = number of bathrooms, X5 = age, X6 = fireplace

Stepwise Regression Analysis
Table of Results for General Stepwise

Living Space entered.

              df    SS             MS             F
Regression    1     303089.8144    303089.8144    38.9789
Residual      59    458768.2282    7775.7327
Total         60    761858.0426

                Coefficients   Standard Error   t Stat     P-value
Intercept       408.2614       33.0407          12.3563    0.0000
Living Space    0.1044         0.0167           6.2433     0.0000

Fireplace entered.

              df    SS             MS             F
Regression    2     356591.2010    178295.6005    25.5169
Residual      58    405266.8416    6987.3593
Total         60    761858.0426

                Coefficients   Standard Error   t Stat     P-value
Intercept       377.8322       33.1954          11.3821    0.0000
Living Space    0.0957         0.0162           5.9176     0.0000
Fireplace       66.2138        23.9289          2.7671     0.0076

No other variables could be entered into the model. Stepwise ends. Based on a stepwise regression analysis with all the original variables, only X2 and X6 make a significant contribution to the model at the 0.05 level. Thus, the best model is the one that includes the living area of the house (X2) and fireplace (X6). This was the model developed in Section 14.6.

15.15

From PHStat, after Multiple Regression Analysis, including VIF, where Y = revenue, X1 = number of partners, X2 = number of offices, X3 = Southeast Region (1 = Yes, 0 = No), X4 = Gulf Coast Region (1 = Yes, 0 = No).

Regression Analysis

Regression Statistics
Multiple R           0.9148
R Square             0.8369
Adjusted R Square    0.8259
Standard Error       29.1776
Observations         64

ANOVA
              df    SS             MS            F
Regression    4     257811.6969    64452.9242    75.7081
Residual      59    50228.7157     851.3342
Total         63    308040.4126

                      Coefficients   Standard Error   t Stat     P-value   R Square   VIF
Intercept             4.7384         6.9717           0.6797     0.4994
Number of Partners    1.3007         0.1463           8.8900     0.0000    0.7267     3.6584
Number of Offices     0.2458         1.2073           0.2036     0.8394    0.7278     3.6743
Southeast Region      4.5279         9.1767           0.4934     0.6236    0.2998     1.4281
Gulf Coast Region     -5.1724        9.2033           -0.5620    0.5762    0.3038     1.4364

Based on a full regression model involving all of the variables, all the VIF values (3.6584, 3.6743, 1.4281, and 1.4364, respectively) are less than 5. There is no reason to suspect the existence of collinearity.


From PHStat, Best Subsets Analysis (partial display), X1 = number of partners, X2 = number of offices, X3 = Southeast Region, X4 = Gulf Coast Region.

Best Subsets Analysis

Intermediate Calculations
R2T        0.836941
1 - R2T    0.163059
n          64
T          5
n-T        59

Model       Cp          k+1    R Square    Adj. R Square    Std. Error
X1          0.1917      2      0.8336      0.8310           28.7490
X2          83.8278     2      0.6025      0.5961           44.4402
X3          292.1319    2      0.0268      0.0111           69.5355
X4          301.8157    2      0.0000      -0.0161          70.4852
X1X2        2.1899      3      0.8337      0.8282           28.9832
X1X3        1.3327      3      0.8360      0.8306           28.7761
X1X4        1.2836      3      0.8362      0.8308           28.7642
X2X3        80.7616     3      0.6165      0.6039           44.0068
X2X4        82.0991     3      0.6128      0.6001           44.2184
X3X4        289.8578    3      0.0386      0.0071           69.6764
X1X2X3      3.3159      4      0.8361      0.8279           29.0108
X1X2X4      3.2435      4      0.8363      0.8281           28.9931
X1X3X4      3.0415      4      0.8368      0.8287           28.9436
X2X3X4      82.0321     4      0.6185      0.5994           44.2552
X1X2X3X4    5.0000      5      0.8369      0.8259           29.1776

Based on a best subsets regression and examination of the resulting Cp values, the best model appears to be a model with only variable X1, which has Cp = 0.1917. Models that add other variables do not change the results very much.

[Figure: residual plots]

The residual plot versus the number of partners reveals some evidence for potential deviation from the equal variance assumption. The normal probability plot reveals evidence of potential departure from the normality assumption.


From PHStat, Stepwise Regression Analysis, X1 = number of partners, X2 = number of offices, X3 = Southeast Region, X4 = Gulf Coast Region.

Stepwise Regression Analysis
Table of Results for General Stepwise

Number of Partners entered.

              df    SS             MS             F
Regression    1     256797.1392    256797.1392    310.7027
Residual      62    51243.2734     826.5044
Total         63    308040.4126

                      Coefficients   Standard Error   t Stat     P-value
Intercept             4.8429         4.5358           1.0677     0.2898
Number of Partners    1.3285         0.0754           17.6268    0.0000

No other variables could be entered into the model. Stepwise ends. Based on a stepwise regression analysis with all the original variables, only X1 makes a significant contribution to the model at the 0.05 level. Thus, the best model is the one that includes only the number of partners (X1).

15.16

Leave-one-out cross-validation (LOOCV) divides the data into a number of parts equal to the sample size. It is best used for smaller data sets, not for a sample of n = 10,000.

15.17

An overfit model is one that is too specific to a sample to be of best use for all possible samples.

15.18

The holdout method divides the data into two parts, the training and test sets, and then holds out the test set in the initial analysis.
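The difference between the holdout method and LOOCV described in 15.16 through 15.18 can be sketched in a few lines of Python. The function names, the 30% test fraction, and the fixed random seed are assumptions made for illustration only; they do not come from the text.

```python
import random


def holdout_split(rows, test_fraction=0.3, seed=0):
    """Split rows once into a training set and a held-out test set.

    test_fraction and seed are illustrative choices, not prescribed values.
    """
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def loocv_splits(n):
    """Yield n (train_indices, test_indices) pairs, each holding out one row."""
    for i in range(n):
        train = [j for j in range(n) if j != i]
        yield train, [i]


train, test = holdout_split(range(10))
print(len(train), len(test))          # one split: 7 training rows, 3 test rows

# LOOCV fits the model once per observation, so the number of fits equals n.
print(sum(1 for _ in loocv_splits(10)))
```

Because LOOCV requires one model fit per observation, a sample of n = 10,000 would need 10,000 fits, which is why it is better suited to small data sets.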

15.19

Answers may vary.

15.20

Answers may vary.

15.21

Answers may vary.

15.22

Estimated Probability of Success = $\frac{\text{Estimated Odds Ratio}}{1 + \text{Estimated Odds Ratio}} = \frac{0.75}{1 + 0.75} = 0.4286$

15.23

(a)

Holding constant the effects of X2, for each additional unit of X1 the natural logarithm of the odds ratio is estimated to increase by a mean of 0.5. Holding constant the effects of X1, for each additional unit of X2 the natural logarithm of the odds ratio is estimated to increase by a mean of 0.2.

(b)

ln(estimated odds ratio) = 0.1 + 0.5X1 + 0.2X2 = 0.1 + 0.5(2) + 0.2(1.5) = 1.4
Estimated odds ratio = $e^{1.4}$ = 4.055. The estimated odds of "success" to failure are 4.055 to 1.

(c)

Estimated Probability of the Event of Interest = $\frac{\text{Estimated Odds Ratio}}{1 + \text{Estimated Odds Ratio}} = \frac{4.055}{1 + 4.055} = 0.8022$

15.24

(a)

ln(estimated odds ratio) = –6.94 + 0.13947X1 + 2.774X2 = –6.94 + 0.13947(36) + 2.774(0) = –1.91908
Estimated odds ratio = $e^{-1.91908}$ = 0.1467
Estimated Probability of the Event of Interest = $\frac{\text{Estimated Odds Ratio}}{1 + \text{Estimated Odds Ratio}} = \frac{0.1467}{1 + 0.1467} = 0.1280$

(b)

From the text discussion of the example, 70.16% of the individuals who charge $36,000 per annum and possess additional cards can be expected to purchase the premium card. Only 12.80% of the individuals who charge $36,000 per annum and do not possess additional cards can be expected to purchase the premium card. For a given amount of money charged per annum, the likelihood of purchasing a premium card is substantially higher among individuals who already possess additional cards than among those who do not.

(c)

ln(estimated odds ratio) = –6.94 + 0.13947X1 + 2.774X2 = –6.94 + 0.13947(18) + 2.774(0) = –4.42954
Estimated odds ratio = $e^{-4.42954}$ = 0.0119
Estimated Probability of the Event of Interest = $\frac{\text{Estimated Odds Ratio}}{1 + \text{Estimated Odds Ratio}} = \frac{0.0119}{1 + 0.0119} = 0.01178$

(d)

Among individuals who do not purchase additional cards, the likelihood of purchasing a premium card diminishes dramatically with a substantial decrease in the amount charged per annum.

15.25

(a)

Let X1 = distance traveled to rehabilitation in km, X2 = whether the person had a car (0 = no, 1 = yes), X3 = age of the person in years.
ln(estimated odds) = 5.7765 – 0.0675X1 + 1.9369X2 – 0.0599X3

(b)

ln(estimated odds) = 5.7765 – 0.0675(20) + 1.9369(1) – 0.0599(65) = 2.4699
Estimated odds ratio = $e^{2.4699}$ = 11.8213
Estimated Probability of the Event of Interest = $\frac{\text{Estimated Odds Ratio}}{1 + \text{Estimated Odds Ratio}} = 0.9220$

(c)

ln(estimated odds) = 5.7765 – 0.0675(20) + 1.9369(0) – 0.0599(65) = 0.533
Estimated odds ratio = $e^{0.533}$ = 1.7040
Estimated Probability of the Event of Interest = $\frac{\text{Estimated Odds Ratio}}{1 + \text{Estimated Odds Ratio}} = 0.6301$

(d)

Holding everything else the same, a person who has a car has a much higher probability of participating in rehabilitation.

(e)

X1: test statistic = –6.113. p-value = 0.0000 < 0.05. Reject H0. There is sufficient evidence to conclude that the distance traveled makes a significant contribution to the model.
X2: test statistic = 7.121. p-value = 0.0000 < 0.05. Reject H0. There is sufficient evidence to conclude that whether a person has a car makes a significant contribution to the model.
X3: test statistic = –5.037. p-value = 0.0000 < 0.05. Reject H0. There is sufficient evidence to conclude that the age of a person makes a significant contribution to the model.

(f)

Holding everything else constant, the farther the distance traveled to rehabilitation, the less likely a person is to participate in rehabilitation. Holding everything else constant, a person who has a car is more likely to participate in rehabilitation. Holding everything else constant, the older a person is, the less likely the person is to participate in rehabilitation.

15.26

(a)

Binary Logistic Regression

Predictor        Coefficients    SE Coef    Z          p-Value
Intercept        -47.4821        12.0173    -3.9512    0.0001
fixed acidity    1.310179398     0.4139     3.1656     0.0015
chlorides        90.57937563     22.643     4.0003     0.0001
pH               9.779258829     2.9743     3.288      0.0010

Deviance         54.45564087

(b)

Holding constant the effects of chlorides and pH, for each increase of one unit of fixed acidity, ln(odds) increases by an estimate of 1.3102. Holding constant the effects of fixed acidity and pH, for each increase of one unit in chlorides, ln(odds) increases by an estimate of 90.5794. Holding constant the effects of fixed acidity and chlorides, for each increase of one unit in pH, ln(odds) increases by an estimate of 9.7793.

(c)

ln(estimated odds ratio) = –47.4821 + 1.3102X1 + 90.5794X2 + 9.7793X3 = –0.4603
Estimated odds ratio = $e^{-0.4603}$ = 0.6311
Estimated Probability of the Event of Interest = $\frac{\text{Estimated Odds Ratio}}{1 + \text{Estimated Odds Ratio}} = \frac{0.6311}{1 + 0.6311} = 0.3869$

(d)

The deviance statistic is 54.4556, which has a p-value of 0.9998. Do not reject H0. The model is a good fitting model.

(e)

For fixed acidity: ZSTAT = 3.1656 with a p-value = 0.0015. Reject H0. There is sufficient evidence that fixed acidity makes a significant contribution to the model.
For chlorides: ZSTAT = 4.0003 with a p-value = 0.0001. Reject H0. There is sufficient evidence that the amount of chlorides makes a significant contribution to the model.
For pH: ZSTAT = 3.2880 with a p-value = 0.0010. Reject H0. There is sufficient evidence that pH makes a significant contribution to the model.

(f)

Based on the p-values corresponding to the Z-values for the variable coefficients in the logistic regression equation and corresponding to the deviance statistic, the model that includes fixed acidity, chlorides and pH should be used to predict whether the wine is red.

15.27

(a)

Let X1 = price of the pizza. Using PHStat, ln(estimated odds) = 1.243 – 0.25034X1
For X1: Z = –2.68 < –1.96. Reject H0. There is sufficient evidence that price of the pizza makes a significant contribution to the model.

(b)

Let X1 = price of the pizza, X2 = status. Using PHStat, ln(estimated odds) = 1.220 – 0.25019X1 + 0.0377X2
For X1: ZSTAT = –2.68 < –1.96. Reject H0. There is sufficient evidence that price of the pizza makes a significant contribution to the model.
For X2: ZSTAT = 0.10 < 1.96. Do not reject H0. There is not sufficient evidence to conclude that status makes a significant contribution to the model.

(c)

Model (a): Deviance statistic = 0.258. p-value = 0.998 > 0.05. Do not reject H0. There is insufficient evidence to conclude that model (a) is not a good fit.
Model (b): Deviance statistic = 7.804. p-value = 0.731 > 0.05. Do not reject H0. There is insufficient evidence to conclude that model (b) is not a good fit.
However, the Z test in (b) suggests that there is not sufficient evidence to conclude that status makes a significant contribution to the model. Using the parsimony principle, the model in (a) is preferred to the model in (b).

(d)

ln(estimated odds ratio) = 1.243 – 0.25034X1 = 1.243 – 0.25034(8.99) = –1.0076
Estimated odds ratio = $e^{-1.0076}$ = 0.3651
Estimated Probability of the Event of Interest = $\frac{\text{Estimated Odds Ratio}}{1 + \text{Estimated Odds Ratio}} = \frac{0.3651}{1 + 0.3651} = 0.2675$

(e)

ln(estimated odds ratio) = 1.243 – 0.25034X1 = 1.243 – 0.25034(11.49) = –1.6334
Estimated odds ratio = $e^{-1.6334}$ = 0.1953
Estimated Probability of the Event of Interest = $\frac{\text{Estimated Odds Ratio}}{1 + \text{Estimated Odds Ratio}} = \frac{0.1953}{1 + 0.1953} = 0.1634$

(f)

ln(estimated odds ratio) = 1.243 – 0.25034X1 = 1.243 – 0.25034(13.99) = –2.2593
Estimated odds ratio = $e^{-2.2593}$ = 0.1044
Estimated Probability of the Event of Interest = $\frac{\text{Estimated Odds Ratio}}{1 + \text{Estimated Odds Ratio}} = \frac{0.1044}{1 + 0.1044} = 0.0946$

15.28

(a)

Predictor                         Coefficients    SE Coef    Z          p-Value
Intercept                         -0.6048         0.4194     -1.4421    0.1493
claims/year                       0.093769442     0.5029     0.1865     0.8521
new business (1=yes, 0=no):1      1.810770296     0.8134     2.2261     0.0260

Deviance                          119.4353239     p-value    0.0457

(b)

Holding constant the effects of whether the policy is new, for each increase of one in the number of claims submitted per year by the policy holder, ln(odds) increases by an estimate of 0.0938. Holding constant the number of claims submitted per year by the policy holder, ln(odds) is estimated to be 1.8108 higher when the policy is new as compared to when the policy is not new.

(c)

ln(estimated odds ratio) = –0.6048 + 0.0938(1) + 1.8108(1) = 1.2998
Estimated odds ratio = $e^{1.2998}$ = 3.6684
Estimated Probability of the Event of Interest = $\frac{\text{Estimated Odds Ratio}}{1 + \text{Estimated Odds Ratio}} = 0.7858$

(d)

The deviance statistic is 119.4353 with a $\chi^2$ distribution of 95 d.f. and p-value = 0.0457 < 0.05. Reject H0. The model is not a good fitting model.

(e)

For claims/year: ZSTAT = 0.1865, p-value = 0.8521 > 0.05. Do not reject H0. There is not sufficient evidence that the number of claims submitted per year by the policy holder makes a significant contribution to the logistic model.
For new business: ZSTAT = 2.2261, p-value = 0.0260 < 0.05. Reject H0. There is sufficient evidence that whether the policy is new makes a significant contribution to the logistic model.

(f)

PHStat output:

Predictor      Coefficients    SE Coef    Z          p-Value
Intercept      -1.0125         0.3888     -2.6042    0.0092
claims/year    0.992742206     0.3367     2.9481     0.0032

Deviance       125.0102452     p-value    0.0250

(g)

PHStat output:

Predictor                         Coefficients    SE Coef    Z          p-Value
Intercept                         -0.5423         0.2515     -2.1563    0.0311
new business (1=yes, 0=no):1      1.928618927     0.5211     3.7008     0.0002

Deviance                          119.4701921     p-value    0.0526

(h)

The deviance statistic for (f) is 125.0102 with a $\chi^2$ distribution of 96 d.f. and p-value = 0.0250 < 0.05. Reject H0. The model is not a good fitting model. The deviance statistic for (g) is 119.4702 with a $\chi^2$ distribution of 96 d.f. and p-value = 0.0526 > 0.05. Do not reject H0. The model is a good fitting model. The model in (g) should be used to predict a fraudulent claim.

15.29

(a)

Binary Logistic Regression

Predictor    Coefficients    SE Coef    Z          p-Value
Intercept    -1.6023         0.9884     -1.6211    0.1050
calls        0.061953028     0.0291     2.1316     0.0330
visits       0.094175407     0.5214     0.1806     0.8567

Deviance     30.13249138

(b)

Holding constant the effects of the number of visits the customer makes to the local service center, for each increase of one in the number of calls the customer makes to the company call center, ln(odds) increases by an estimate of 0.0620. Holding constant the number of calls the customer makes to the company call center, for each increase of one in the number of visits the customer makes to the local service center, ln(odds) increases by an estimate of 0.0942.

(c)

ln(estimated odds ratio) = –1.6023 + 0.0620(10) + 0.0942(1) = –0.8886
Estimated odds ratio = $e^{-0.8886}$ = 0.4112
Estimated Probability of the Event of Interest = $\frac{\text{Estimated Odds Ratio}}{1 + \text{Estimated Odds Ratio}} = 0.2914$

(d)

The deviance statistic is 30.1325 with a p-value = 0.3082 > 0.05. Do not reject H0. The model is a good fitting model.

(e)

For calls: ZSTAT = 2.1316, p-value = 0.0330 < 0.05. Reject H0. There is sufficient evidence that the number of calls the customer makes to the company call center makes a significant contribution to the logistic model.
For visits: ZSTAT = 0.1806, p-value = 0.8567 > 0.05. Do not reject H0. There is not sufficient evidence that the number of visits the customer makes to the local service center makes a significant contribution to the logistic model.

(f)

Binary Logistic Regression

Predictor    Coefficients    SE Coef    Z          p-Value
Intercept    -1.4702         0.6496     -2.2633    0.0236
calls        0.060084224     0.0267     2.249      0.0245

Deviance     30.16532266

(g)

PHStat output: Binary Logistic Regression

Predictor    Coefficients    SE Coef    Z         p-Value
Intercept    0.4786          0.5425     0.8822    0.3777
visits       -0.51392294     0.4231     -1.215    0.2245

Deviance     40.06397737

(h)

Since there is not sufficient evidence that the number of visits the customer makes to the local service center makes a significant contribution to the logistic model in (e), the model in (f) should be used.

15.30

(a)

ln(estimated odds) = 1.252 – 0.0323 Age + 2.2165 Subscribes to the wellness newsletters.

(b)

Holding constant the effect of subscribing to the wellness newsletters, for each increase of one year in age, ln(estimated odds) decreases by an estimate of 0.0323. Holding constant the effect of age, for a customer who subscribes to the wellness newsletters, ln(estimated odds) increases by an estimate of 2.2165.

(c)

0.912

(d)

Deviance = 102.8762, p-value = 0.3264. Do not reject H0, so the model is adequate.

(e)

For Age: Z = –1.8053 > –1.96. Do not reject H0. For subscribes to the wellness newsletters: Z = 4.3286 > 1.96. Reject H0.

(f)

Only subscribes to wellness newsletters is useful in predicting whether a customer will purchase organic food.
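The Z tests in (e) compare each statistic with the critical values ±1.96. An equivalent check, sketched below in Python, converts a Z statistic into a two-tail p-value using the standard normal distribution. The function name is hypothetical; the calculation relies only on `math.erfc` from the standard library.

```python
import math


def two_tail_p_value(z: float) -> float:
    """Two-tail p-value for a standard normal test statistic.

    Uses the identity 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2)).
    """
    return math.erfc(abs(z) / math.sqrt(2.0))


# Z statistics reported in 15.30 (e).
for label, z in [("Age", -1.8053), ("Subscribes to newsletters", 4.3286)]:
    p = two_tail_p_value(z)
    decision = "Reject H0" if p < 0.05 else "Do not reject H0"
    print(f"{label}: Z = {z}, p-value = {p:.4f}, {decision}")
```

Comparing the p-value with 0.05 leads to the same decisions as comparing Z with ±1.96.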

15.31

In least squares regression the dependent variable is numerical. Using a categorical dependent variable in least squares regression would violate the normality assumption and would not be appropriate with this method. Logistic regression allows one to predict a categorical dependent variable utilizing the odds ratio.
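To make the role of the odds ratio concrete, the following Python sketch converts a fitted ln(odds) equation into an estimated probability. The function names are introduced here for illustration only; the coefficients are those reported for the pizza model in 15.27 (a), evaluated at a price of 8.99, and the odds of 0.75 come from 15.22.

```python
import math


def odds_to_probability(odds: float) -> float:
    """Estimated probability = odds / (1 + odds)."""
    return odds / (1.0 + odds)


def logistic_probability(intercept, coefficients, x_values):
    """Return (ln odds, odds, probability) for one set of X values."""
    ln_odds = intercept + sum(b * x for b, x in zip(coefficients, x_values))
    odds = math.exp(ln_odds)
    return ln_odds, odds, odds_to_probability(odds)


# 15.22: odds of 0.75 correspond to a probability of about 0.4286.
print(round(odds_to_probability(0.75), 4))

# 15.27 (d): ln(estimated odds) = 1.243 - 0.25034 * X1 with X1 = 8.99.
ln_odds, odds, prob = logistic_probability(1.243, [-0.25034], [8.99])
print(round(ln_odds, 4), round(odds, 4), round(prob, 4))
```

Whatever the value of ln(odds), the resulting probability always falls between 0 and 1, which is what makes this form suitable for a categorical dependent variable.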

15.32

In order to evaluate whether independent variables are intercorrelated, you can compute the Variance Inflationary Factor (VIF).
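As a reminder of how a VIF is obtained, VIF_j = 1 / (1 – R_j squared), where R_j squared is the coefficient of determination from regressing independent variable j on all the other independent variables. The sketch below assumes R_j squared has already been computed; the function name is hypothetical. The example inputs are R Square values reported in the PHStat output for Problem 15.35, and the results differ from the reported VIFs only by rounding.

```python
def variance_inflationary_factor(r_square_j: float) -> float:
    """VIF for one independent variable, given the R-square from regressing
    that variable on all the other independent variables."""
    if not 0.0 <= r_square_j < 1.0:
        raise ValueError("r_square_j must be in [0, 1)")
    return 1.0 / (1.0 - r_square_j)

# R Square values taken from the 15.35 output (runs per game, Saves).
for r2 in (0.9257, 0.4565):
    print(r2, round(variance_inflationary_factor(r2), 2))
```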

15.33

You use logistic regression when the dependent variable is a categorical variable.

15.34

One way to choose among models that meet these criteria is to determine whether the models contain a subset of variables that are common, and then test whether the contribution of the additional variables is significant.

15.35

From PHStat, Y = wins, X1 = runs per game, X2 = batting average, X3 = home runs, X4 = ERA, X5 = Saves, X6 = WHIP, X7 = OBS.

Variable           Coefficients   Standard Error   t Stat     P-value   R Square   VIF
Intercept          17.7302        26.6511          0.6653     0.5128
Runs per game      9.8491         4.6436           2.1210     0.0454    0.9257     13.4578
Batting Average    -223.0903      150.0442         -1.4868    0.1513    0.9018     10.1857
Home Runs          -0.1099        0.0468           -2.3497    0.0282    0.8815     8.4366
ERA                -11.1796       3.8947           -2.8704    0.0089    0.9335     15.0460
Saves              0.2735         0.1056           2.5913     0.0167    0.4565     1.8400
WHIP               -14.9478       21.0550          -0.7099    0.4852    0.9348     15.3361
OBS                207.2519       94.8418          2.1852     0.0398    0.9741     38.5477

As a first step in the model building process, a review of the VIFs for the seven independent variables reveals that only one variable, Saves, had a VIF value less than 5, which indicates that only that variable is free from collinearity problems. The variable with the largest VIF, OBS, was removed before running a second regression analysis.

Variable           Coefficients   Standard Error   t Stat     P-value   R Square   VIF
Intercept          58.8155        20.3803          2.8859     0.0083
Runs per game      16.6449        3.7208           4.4735     0.0002    0.8653     7.4220
Batting Average    49.1552        90.2209          0.5448     0.5911    0.6839     3.1634
Home Runs          -0.0326        0.0330           -0.9870    0.3339    0.7228     3.6074
ERA                -11.5174       4.1989           -2.7429    0.0116    0.9334     15.0222
Saves              0.2446         0.1130           2.1648     0.0410    0.4478     1.8111
WHIP               -15.6068       22.7151          -0.6871    0.4989    0.9348     15.3330

After removing the variable with the largest VIF, OBS, a second regression analysis reveals three variables, runs per game, ERA (earned run average), and WHIP, with a VIF above five. The variable with the largest VIF, WHIP, was removed before running a third regression analysis.

Variable           Coefficients   Standard Error   t Stat     P-value   R Square   VIF
Intercept          50.2879        15.9862          3.1457     0.0044
Runs per game      17.3811        3.5237           4.9326     0.0000    0.8531     6.8064
Batting Average    36.1803        87.2467          0.4147     0.6821    0.6694     3.0248
Home Runs          -0.0369        0.0320           -1.1505    0.2613    0.7125     3.4779
ERA                -14.1834       1.5868           -8.9386    0.0000    0.5441     2.1935
Saves              0.2464         0.1117           2.2055     0.0372    0.4476     1.8101

After removing the variables with the largest VIF, OBS and WHIP, a third regression analysis reveals only one variable, runs per game, with a VIF above five. Runs per game was removed before running a fourth regression analysis. In the fourth regression analysis, all of the remaining variables, batting average (X1), home runs (X2), ERA (X3), and Saves (X4), have VIF values below 5.

Regression Analysis

Regression Statistics
Multiple R           0.9589
R Square             0.9195
Adjusted R Square    0.9066
Standard Error       4.4862
Observations         30

Variable           Coefficients   Standard Error   t Stat     P-value   R Square   VIF
Intercept          31.6338        21.5962          1.4648     0.1554
Batting Average    376.4944       74.2517          5.0705     0.0000    0.1176     1.1333
Home Runs          0.0849         0.0284           2.9892     0.0062    0.2926     1.4135
ERA                -16.5869       2.0996           -7.8999    0.0000    0.4967     1.9867
Saves              0.2198         0.1552           1.4165     0.1690    0.4463     1.8059

The best-subset approach yielded: Cp = 5.0000 with X1X2X3X4 and adjusted r2 = 0.9066. (display of 3 smallest Cp values from PHStat)

Model       Cp        k+1   R Square   Adj. R Square   Std. Error
X1X2X3      5.0064    4     0.9131     0.9030          4.5722
X1X3X4      11.9352   4     0.8908     0.8782          5.1253
X1X2X3X4    5.0000    5     0.9195     0.9066          4.4862

The most appropriate multiple regression model for predicting wins is: Ŷ = 31.6338 + 376.4944X1 + 0.0849X2 – 16.5869X3 + 0.2198X4, for X1 = batting average, X2 = home runs, X3 = ERA, X4 = saves.


Residual plots against each of the four independent variables reveal no obvious pattern, which indicates insufficient evidence of violations of the equal variance and linearity assumptions.

The normal probability plot indicates no evidence of a violation of the normality assumption.

15.36

A regression analysis with all three variables reveals that all had VIF values well below five with the largest VIF = 1.06.


A best subsets regression reveals that the model with pressure and cost had the lowest Cp value and the highest adjusted r2.

A regression analysis on the two variable model with pressure and cost reveals that the pressure variable was not significant at the 0.05 significance level.


A regression analysis with cost as the only variable reveals a significant FSTAT for the overall model. The cost variable is significant at the 0.05 significance level. Because tSTAT = –2.81 or p-value = 0.016, reject H0. The r2 of 0.3971 indicates that 39.71% of variation in registration errors can be explained by the cost of the material.

Residual analyses reveal no clear patterns, which indicates that there appear to be no violations of the assumptions. The best linear model, which contains only cost as the predictor variable, is: Ŷ = 1.77 – 14.23X, where X = cost (low = 1).

15.37

Since the variable Rooms is the sum of Bathrooms, Bedrooms, Loft/Den and Finished Basement, it is removed from the list of potential independent variables. Including it will introduce perfect collinearity.

An analysis of the linear regression model with all of the remaining seven possible independent variables revealed that none of the variables have VIF values in excess of 5.0.


A best subsets regression produces the following potential models that have Cp values less than or equal to k+1.

Model             Cp         k+1   R Square   Adj. R Square   Consider This Model?
X1X2X3X4X5        5.657125   6     0.532645   0.490157971     yes
X1X2X3X5X6        5.78991    6     0.531509   0.488919347     yes
X1X2X3X4X5X6      6.074978   7     0.546173   0.49574803      yes
X1X2X3X4X5X6X7    8          8     0.546814   0.486959627     yes

where X1 = Hot tub (0 = No and 1 = Yes), X2 = Lake View (0 = No and 1 = Yes), X3 = Bathrooms, X4 = Bedrooms, X5 = Loft/Den (0 = No and 1 = Yes), X6 = Finished Basement (0 = No and 1 = Yes), X7 = Acres.


Looking at the p-values of the t statistics for each slope coefficient of the model that includes X1 through X7 reveals that Bathrooms, Bedrooms, Loft/Den, Finished Basement and Acres are not significant at the 5% level of significance.

Variable             Coefficients   Standard Error   t Stat        P-value
Intercept            56.86025281    75.83635651      0.749775641   0.456705242
Hot Tub              83.33704829    39.54169465      2.107574019   0.039815205
Lake View            188.1459004    46.55629756      4.041255648   0.000172817
Bathrooms            44.97723054    28.05985879      1.602902954   0.114899805
Bedrooms             31.7825021     24.30126221      1.307853963   0.196567371
Loft/Den             68.68786861    34.62218517      1.983926441   0.052453764
Finished Basement    42.40047126    35.03765953      1.210139942   0.231594833
Acres                8.214876036    30.00092343      0.273820773   0.7852867

Dropping Acres, which has the highest p-value, the new regression indicates that Bathrooms, Bedrooms and Finished Basement are still not significant.

Variable             Coefficients   Standard Error   t Stat        P-value
Intercept            63.04035916    71.77716439      0.878278763   0.383683708
Hot Tub              82.87369204    39.16564269      2.115979372   0.038973223
Lake View            190.4252197    45.41206354      4.193273877   0.000102798
Bathrooms            44.42625163    27.74686825      1.601126702   0.115183479
Bedrooms             31.82322517    24.09177116      1.320916796   0.192099589
Loft/Den             68.8302055     34.3204956       2.005513157   0.049930143
Finished Basement    43.67867801    34.42659968      1.268747957   0.209972962

Dropping Finished Basement, which has the largest p-value, the new regression indicates that Bathrooms and Bedrooms are still not significant.

Variable     Coefficients   Standard Error   t Stat        P-value
Intercept    33.50312632    68.27210075      0.490729389   0.625570141
Hot Tub      98.1802218     37.46721666      2.620430087   0.011330891
Lake View    181.7931221    45.14769931      4.026630921   0.000174862
Bathrooms    52.45485906    27.1649841       1.930973303   0.058647073
Bedrooms     36.99281947    23.87596476      1.549374856   0.127027169
Loft/Den     79.72730905    33.41209396      2.386181158   0.020493138

Dropping Bedrooms next, which has the largest p-value, the new regression indicates that all the remaining variables become significant at the 5% level of significance.

Variable     Coefficients   Standard Error   t Stat        P-value
Intercept    102.0014472    52.67086383      1.936582007   0.057848524
Hot Tub      89.12922256    37.4689494       2.378748911   0.020806217
Lake View    183.8349004    45.68930995      4.023586712   0.00017363
Bathrooms    77.69078822    22.01055013      3.529706789   0.000839855
Loft/Den     76.80439638    33.77336974      2.274111141   0.026811687

The best linear model is determined to be:

Ŷ = 102.0014 + 89.1292X1 + 183.8349X2 + 77.6908X3 + 76.8044X5. The overall model has FSTAT = 14.7030 (4 and 56 degrees of freedom) with a p-value that is virtually 0. r2 = 0.5122, adjusted r2 = 0.4774.
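The reported adjusted r2 and FSTAT can be checked directly from r2, the sample size, and the number of independent variables. The sketch below assumes n = 61, which is implied by 4 and 56 degrees of freedom (n – k – 1 = 56 with k = 4); the function names are hypothetical. Small differences from the reported values come from using the rounded r2 = 0.5122.

```python
def adjusted_r_square(r2: float, n: int, k: int) -> float:
    """Adjusted r2 = 1 - (1 - r2)(n - 1)/(n - k - 1)."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)

def overall_f_stat(r2: float, n: int, k: int) -> float:
    """Overall F = (r2 / k) / ((1 - r2) / (n - k - 1)), i.e. MSR / MSE."""
    return (r2 / k) / ((1.0 - r2) / (n - k - 1))

n, k, r2 = 61, 4, 0.5122  # n is inferred from the degrees of freedom (assumption)
print(round(adjusted_r_square(r2, n, k), 4))
print(round(overall_f_stat(r2, n, k), 2))
```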


A residual analysis does not reveal any strong patterns and the normal probability plot does not suggest any departure from the normality assumption.

15.38

(a)

From PHStat, Y = asking price, X1 = house size, X2 = number of bedrooms, X3 = number of bathrooms, X4 = age.

Regression Analysis

Regression Statistics
Multiple R           0.9062
R Square             0.8211
Adjusted R Square    0.8086
Standard Error       196.8748
Observations         62

Variable      Coefficients   Standard Error   t Stat     P-value   R Square   VIF
Intercept     326.0868       142.2280         2.2927     0.0256
House Size    0.0154         0.0023           6.7138     0.0000    0.3151     1.4600
Bedrooms      51.0930        28.5164          1.7917     0.0785    0.3707     1.5891
Bathrooms     163.9062       33.8509          4.8420     0.0000    0.4930     1.9723
Age           -3.4459        0.9907           -3.4782    0.0010    0.2441     1.3229

Glen Cove: Based on a full regression model involving all of the variables: All VIFs are less than 5. So there is no reason to suspect collinearity between any pair of variables. The best-subset approach yielded: Cp = 5.0000 with X1X2X3X4 and adjusted r2 = 0.8086.

Model       Cp         k+1   R Square   Adj. R Square   Std. Error
X1          90.3313    2     0.5345     0.5267          309.5501
X2          137.9638   2     0.3850     0.3747          355.7972
X3          65.1831    2     0.6134     0.6070          282.0916
X4          181.0711   2     0.2497     0.2372          392.9870
X1X2        61.2776    3     0.6319     0.6195          277.5695
X1X3        17.2209    3     0.7702     0.7624          219.3218
X1X4        38.9390    3     0.7020     0.6919          249.7391
X2X3        52.2616    3     0.6602     0.6487          266.6867
X2X4        104.0881   3     0.4976     0.4805          324.2977
X3X4        60.4780    3     0.6344     0.6220          276.6216
X1X2X3      15.0977    4     0.7831     0.7719          214.8859
X1X2X4      26.4450    4     0.7475     0.7345          231.8600
X1X3X4      6.2102     4     0.8110     0.8013          200.5909
X2X3X4      48.0757    4     0.6796     0.6631          261.1785
X1X2X3X4    5.0000     5     0.8211     0.8086          196.8748
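The Cp values in the best-subsets display can be reproduced from the R Square values with Cp = (1 – R2k)(n – T) / (1 – R2T) – [n – 2(k + 1)], where k is the number of independent variables in the candidate model and T is the number of parameters (including the intercept) in the full model. The following is a minimal sketch using the Glen Cove figures (n = 62, T = 5, full-model R Square = 0.8211); the function name is hypothetical, and results differ slightly from the PHStat display because the R Square inputs are rounded to four decimals.

```python
def mallows_cp(r2_k: float, r2_full: float, n: int, k: int, t: int) -> float:
    """Cp statistic for a candidate model with k independent variables.

    r2_k    -- R-square of the candidate model
    r2_full -- R-square of the full model
    n       -- number of observations
    t       -- number of parameters in the full model, including the intercept
    """
    return (1.0 - r2_k) * (n - t) / (1.0 - r2_full) - (n - 2 * (k + 1))

n, t, r2_full = 62, 5, 0.8211
print(round(mallows_cp(0.8110, r2_full, n, k=3, t=t), 2))   # candidate X1X3X4
print(round(mallows_cp(0.8211, r2_full, n, k=4, t=t), 2))   # full model X1X2X3X4
```

For the full model the formula always reduces to Cp = k + 1, which is why the last row of each display shows Cp equal to k+1.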

The most appropriate multiple regression model for predicting asking price in Glen Cove is: Ŷ = 326.0868 + 0.0154X1 + 51.0930X2 + 163.9062X3 – 3.4459X4, for X1 = house size, X2 = number of bedrooms, X3 = number of bathrooms, X4 = age.

(b)

The adjusted r2 values for the best models in 15.38(a), 15.39(a), and 15.40(a) are 0.8086, 0.4059, and 0.5227, respectively. The model in 15.38(a) has the highest explanatory power after adjusting for the number of independent variables and sample size.

15.39

(a)

From PHStat, Y = asking price, X1 = house size, X2 = number of bedrooms, X3 = number of bathrooms, X4 = age.

Regression Analysis

Regression Statistics
Multiple R           0.6558
R Square             0.4301
Adjusted R Square    0.4039
Standard Error       264.5673
Observations         92

Variable      Coefficients   Standard Error   t Stat     P-value   R Square   VIF
Intercept     469.9116       194.4919         2.4161     0.0178
House Size    0.0336         0.0087           3.8755     0.0002    0.1080     1.1211
Bedrooms      37.5697        44.8910          0.8369     0.4049    0.2910     1.4104
Bathrooms     145.4441       49.9937          2.9092     0.0046    0.3579     1.5574
Age           -4.9885        1.2826           -3.8894    0.0002    0.0964     1.1067

Merrick: Based on a full regression model involving all of the variables: All VIFs are less than 5. So there is no reason to suspect collinearity between any pair of variables. The best-subset approach yielded: Cp = 3.7004 with X1X3X4 and adjusted r2 = 0.4059.

Model       Cp        k+1   R Square   Adj. R Square   Std. Error
X1          42.1513   2     0.1474     0.1379          318.1553
X2          47.6019   2     0.1117     0.1018          324.7489
X3          25.5951   2     0.2559     0.2476          297.2314
X4          42.1917   2     0.1472     0.1377          318.2047
X1X2        30.0341   3     0.2399     0.2228          302.0883
X1X3        17.6092   3     0.3213     0.3060          285.4567
X1X4        16.4081   3     0.3292     0.3141          283.7972
X2X3        26.6559   3     0.2620     0.2454          297.6584
X2X4        34.0587   3     0.2135     0.1959          307.2825
X3X4        16.3915   3     0.3293     0.3142          283.7742
X1X2X3      18.1278   4     0.3310     0.3082          285.0144
X1X2X4      11.4637   4     0.3746     0.3533          275.5586
X1X3X4      3.7004    4     0.4255     0.4059          264.1165
X2X3X4      18.0199   4     0.3317     0.3089          284.8637
X1X2X3X4    5.0000    5     0.4301     0.4039          264.5673

Using Stepwise Regression Analysis in PHStat (partial display) reveals the best model:

Variable      Coefficients   Standard Error   t Stat     P-value
Intercept     578.0637       145.1043         3.9838     0.0001
Bathrooms     166.2756       43.2829          3.8416     0.0002
Age           -5.0919        1.2744           -3.9954    0.0001
House Size    0.0332         0.0087           3.8394     0.0002

The most appropriate multiple regression model for predicting asking price in Merrick is: Ŷ = 578.0637 + 0.0332X1 + 0X2 + 166.2756X3 – 5.0919X4, for X1 = house size, X2 = number of bedrooms, X3 = number of bathrooms, X4 = age.

(b)

The adjusted r2 values for the best models in 15.38(a), 15.39(a), and 15.40(a) are 0.8086, 0.4059, and 0.5227, respectively. The model in 15.38(a) has the highest explanatory power after adjusting for the number of independent variables and sample size.

15.40

(a)

From PHStat, Y = asking price, X1 = house size, X2 = number of bedrooms, X3 = number of bathrooms, X4 = age.

Regression Analysis

Regression Statistics
Multiple R           0.7514
R Square             0.5646
Adjusted R Square    0.5162
Standard Error       190.3234
Observations         41

Variable      Coefficients   Standard Error   t Stat     P-value   R Square   VIF
Intercept     541.8804       238.8438         2.2688     0.0294
House Size    0.0121         0.0171           0.7077     0.4837    0.1198     1.1360
Bedrooms      -47.0501       42.0582          -1.1187    0.2707    0.4830     1.9343
Bathrooms     247.7834       55.1422          4.4935     0.0001    0.5049     2.0197
Age           -3.7219        1.6632           -2.2378    0.0315    0.1559     1.1847

Bellmore: Based on a full regression model involving all of the variables: All VIFs are less than 5. So there is no reason to suspect collinearity between any pair of variables.

The best-subset approach yielded: Cp = 3.1084 with X3X4 and adjusted r2 = 0.5148.

Model       Cp        k+1   R Square   Adj. R Square   Std. Error
X1          38.8899   2     0.0821     0.0586          265.4922
X2          34.0716   2     0.1404     0.1183          256.9258
X3          7.0201    2     0.4676     0.4539          202.2018
X4          27.7709   2     0.2166     0.1965          245.2730
X1X2        29.4430   3     0.2205     0.1795          247.8496
X1X3        6.8207    3     0.4942     0.4675          199.6622
X1X4        27.6016   3     0.2428     0.2030          244.2829
X2X3        7.7427    3     0.4830     0.4558          201.8511
X2X4        23.8691   3     0.2880     0.2505          236.8886
X3X4        3.1084    3     0.5391     0.5148          190.5946
X1X2X3      8.0077    4     0.5040     0.4638          200.3659
X1X2X4      23.1919   4     0.3203     0.2652          234.5460
X1X3X4      4.2515    4     0.5494     0.5129          190.9690
X2X3X4      3.5009    4     0.5585     0.5227          189.0353
X1X2X3X4    5.0000    5     0.5646     0.5162          190.3234

Using Stepwise Regression Analysis in PHStat (partial display) reveals the best model:

Variable     Coefficients   Standard Error   t Stat     P-value
Intercept    535.3201       166.3062         3.2189     0.0026
Bathrooms    210.6304       40.8497          5.1562     0.0000
Age          -3.9061        1.6088           -2.4279    0.0200

The most appropriate multiple regression model for predicting asking price in Bellmore is: Ŷ = 535.3201 + 0X1 + 0X2 + 210.6304X3 – 3.9061X4, for X1 = house size, X2 = number of bedrooms, X3 = number of bathrooms, X4 = age.

(b)

The adjusted r2 values for the best models in 15.38(a), 15.39(a), and 15.40(a) are 0.8086, 0.4059, and 0.5227, respectively. The model in 15.38(a) has the highest explanatory power after adjusting for the number of independent variables and sample size.

15.41

(a)

From PHStat, Y = asking price, X1 = house size, X2 = number of bedrooms, X3 = number of bathrooms, X4 = age, X5 = Glen Cove (1 = Glen Cove, 0 = Merrick).

Regression Analysis

Regression Statistics
Multiple R           0.7914
R Square             0.6264
Adjusted R Square    0.6138
Standard Error       241.2153
Observations         154

Variable      Coefficients   Standard Error   t Stat     P-value   R Square   VIF
Intercept     488.3431       122.9849         3.9708     0.0001
House Size    0.0172         0.0025           6.8999     0.0000    0.3350     1.5038
Bedrooms      40.0918        26.3980          1.5187     0.1310    0.3312     1.4951
Bathrooms     159.7503       30.0709          5.3125     0.0000    0.4176     1.7169
Age           -4.0414        0.8318           -4.8587    0.0000    0.1650     1.1976
Glen Cove     -96.2654       44.0796          -2.1839    0.0305    0.1915     1.2369

Glen Cove and Merrick: Based on a full regression model involving all of the variables: All VIFs are less than 5. So there is no reason to suspect collinearity between any pair of variables. The best-subset approach yielded: Cp = 6.0000 with X1X2X3X4X5 and adjusted r2 = 0.6138.

Model         Cp        k+1   R Square   Adj. R Square   Std. Error
X1X2X3X4      8.7694    5     0.6143     0.6040          244.2473
X1X2X3X5      27.6072   5     0.5668     0.5552          258.8686
X1X2X4X5      32.2221   5     0.5551     0.5432          262.3263
X1X3X4X5      6.3066    5     0.6206     0.6104          242.2706
X2X3X4X5      51.6084   5     0.5062     0.4930          276.3792
X1X2X3X4X5    6.0000    6     0.6264     0.6138          241.2153

The most appropriate multiple regression model for predicting asking price in Glen Cove and Merrick is: Ŷ = 488.3431 + 0.0172X1 + 40.0918X2 + 159.7503X3 – 4.0414X4 – 96.2654X5, for X1 = house size, X2 = number of bedrooms, X3 = number of bathrooms, X4 = age, X5 = Glen Cove (1 = Glen Cove, 0 = Merrick).

(b)

The estimated asking price in Glen Cove is $96.2654 thousand below that in Merrick for two otherwise identical properties.
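To see why the dummy-variable coefficient is read as the gap between the two locations, the sketch below evaluates the fitted equation from (a) for the same house placed in each location. The house characteristics are hypothetical values chosen only for illustration; any values give the same gap, because every other term cancels.

```python
# Coefficients from the fitted equation in 15.41 (a); asking price is in $ thousands.
B0, B_SIZE, B_BED, B_BATH, B_AGE, B_GLEN_COVE = (
    488.3431, 0.0172, 40.0918, 159.7503, -4.0414, -96.2654
)

def predicted_asking_price(size, bedrooms, bathrooms, age, glen_cove):
    """Predicted asking price; glen_cove is 1 for Glen Cove and 0 for Merrick."""
    return (B0 + B_SIZE * size + B_BED * bedrooms + B_BATH * bathrooms
            + B_AGE * age + B_GLEN_COVE * glen_cove)

# Hypothetical house (assumed values, not taken from the data set).
house = dict(size=2000, bedrooms=3, bathrooms=2, age=40)
gap = (predicted_asking_price(**house, glen_cove=1)
       - predicted_asking_price(**house, glen_cove=0))
print(round(gap, 4))  # equals the Glen Cove coefficient
```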

15.42

(a)

From PHStat, Y = asking price, X1 = house size, X2 = number of bedrooms, X3 = number of bathrooms, X4 = age, X5 = Glen Cove (1 = Glen Cove, 0 = otherwise), X6 = Merrick (1 = Merrick, 0 = otherwise).

Regression Analysis

Regression Statistics
Multiple R           0.7914
R Square             0.6264
Adjusted R Square    0.6138
Standard Error       241.2153
Observations         154

Variable      Coefficients   Standard Error   t Stat     P-value   R Square   VIF
Intercept     488.3431       122.9849         3.9708     0.0001
House Size    0.0172         0.0025           6.8999     0.0000    0.3350     1.5038
Bedrooms      40.0918        26.3980          1.5187     0.1310    0.3312     1.4951
Bathrooms     159.7503       30.0709          5.3125     0.0000    0.4176     1.7169
Age           -4.0414        0.8318           -4.8587    0.0000    0.1650     1.1976
Glen Cove     -96.2654       44.0796          -2.1839    0.0305    0.1915     1.2369

Glen Cove and Merrick: Based on a full regression model involving all of the variables: All VIFs are less than 5. So there is no reason to suspect collinearity between any pair of variables. The best-subset approach yielded: Cp = 4.1575 with X1X3X4X6 and adjusted r2 = 0.6196. (display of 3 smallest Cp values from PHStat)

Model         Cp       k+1   R Square   Adj. R Square   Std. Error
X1X3X4X6      4.1575   5     0.6274     0.6196          230.6767
X1X2X3X4X6    5.0000   6     0.6297     0.6199          230.5774
X1X3X4X5X6    6.1444   6     0.6275     0.6176          231.2781

Using Stepwise Regression Analysis in PHStat (partial display) reveals the best model:

Variable      Coefficients   Standard Error   t Stat     P-value
Intercept     482.1616       86.1384          5.5975     0.0000
Bathrooms     184.9559       23.1061          8.0046     0.0000
House Size    0.0176         0.0022           8.1890     0.0000
Age           -4.0378        0.7319           -5.5171    0.0000
Merrick       100.9783       34.6406          2.9150     0.0040

The most appropriate multiple regression model for predicting asking price in Glen Cove, Merrick, and Bellmore is: Ŷ = 482.1616 + 0.0176X1 + 0X2 + 184.9559X3 – 4.0378X4 + 0X5 + 100.9783X6, for X1 = house size, X2 = number of bedrooms, X3 = number of bathrooms, X4 = age, X5 = Glen Cove (1 = Glen Cove, 0 = otherwise), X6 = Merrick (1 = Merrick, 0 = otherwise).

(b)

The estimated asking price in Merrick is $100.9783 thousand above that in Glen Cove or Bellmore for two otherwise identical properties.

15.43

As a first step in the model building process, a review of the VIFs for the seven independent variables reveals that all variables had VIF values less than 5, which indicates that these variables are free from collinearity problems.


A Best Subsets analysis reveals that all models had Cp values below or at k+1, where k represents the number of independent variables. The model with the highest adjusted r2 is the model with growth as the only independent variable. A stepwise regression analysis also produced a model with growth as the only independent variable.


There was insufficient evidence for a relationship between growth and price-to-book value ratio. Because FSTAT = 2.59 or p-value = 0.113, do not reject H0.

A normal probability plot reveals deviation from the normality assumption. To correct for the deviation from the normality assumption, a natural log transformation was performed on the price-to-book value ratio dependent variable. However, one of the rows contained a negative price-to-book value ratio. This case, row 60, was removed before performing the transformation.

A review of the VIFs for the seven independent variables reveals that all variables had VIF values less than 5, which indicates that these variables are free from collinearity problems.

A Best Subsets analysis with the natural log transformation of the dependent variable reveals that most of the models had Cp values below or at k+1, where k represents the number of independent variables. The model with the highest adjusted r2 is the model with growth as the only independent variable.


A regression analysis with the transformed price-to-book value ratio as the dependent variable and growth as the independent variable reveals a significant FSTAT for the overall model. The growth coefficient had a significant tSTAT at the 0.05 significance level. The r2 of 0.0995 indicates that 9.95% of the variation in the natural log of the price-to-book value ratio can be explained by the variation in growth. The regression equation for this model is ln(Ŷ) = 1.590 – 0.0128(X1), where X1 = growth.
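Because the dependent variable was log-transformed, a prediction on the original price-to-book scale requires exponentiating the fitted value. The sketch below shows that back-transformation for the reported equation; the function name and the growth values are hypothetical, and the simple exp() back-transform shown here ignores any retransformation bias adjustment.

```python
import math

def predicted_price_to_book(growth: float) -> float:
    """Back-transform ln(Y-hat) = 1.590 - 0.0128 * growth to the original scale."""
    ln_y_hat = 1.590 - 0.0128 * growth
    return math.exp(ln_y_hat)

# Hypothetical growth values, chosen only to show the direction of the effect.
for growth in (0, 10, 20):
    print(growth, round(predicted_price_to_book(growth), 3))
```

Since the slope is negative, the predicted ratio declines by a constant percentage, not a constant amount, for each one-unit increase in growth.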

Residual plots against the growth independent variable reveal no obvious pattern, which indicates insufficient evidence of violations of the equal variance and linearity assumptions. The normal probability plot reveals no evidence of deviation from the normality assumption with this model. The present model using the transformed price-to-book value ratio dependent variable and growth as the single independent variable represents the one model associated with a significant FSTAT for the overall model. However, it is relatively weak in that it can account for less than 10% of the variation in the dependent variable. In general, none of the variables in the dataset appeared to be very predictive of the price-to-book value ratio.

15.44

In the multiple regression model with catalyst, pH, pressure, temperature and voltage as independent variables, none of the variables have a VIF value of 5 or larger.

The best-subset approach yielded only the following model to be considered:

Model         Cp   k+1   R Square   Adj. R Square   Std. Error
X1X2X3X4X5    6    6     0.875922   0.861822068     1.293575

where X1 = catalyst, X2 = pH, X3 = pressure, X4 = temp, and X5 = voltage. Looking at the p-values of the t statistics for each slope coefficient of the model that includes X1 through X5 reveals that pH level is not significant at the 5% level of significance.

Variable     Coefficients    Standard Error   t Stat          P-value
Intercept    4.454255233     8.222983547      0.541683588     0.590769119
Catalyst     0.162669323     0.036277562      4.484020293     5.18724E-05
pH           0.086375011     0.080013101      1.079510851     0.286242198
Pressure     –0.043059299    0.013464369      –3.198018263    0.002564899
Temp         –0.402556214    0.069704281      –5.775200729    7.21416E-07
Voltage      0.422370024     0.028413318      14.86521277     9.13658E-19

The multiple regression model with pH level deleted shows that all coefficients are significant individually at the 5% level of significance.

Variable     Coefficients    Standard Error   t Stat          P-value
Intercept    3.683340948     8.206951065      0.44880747      0.655724457
Catalyst     0.154754083     0.035594069      4.347749199     7.77444E-05
Pressure     –0.041971526    0.013451255      –3.120268445    0.003150939
Temp         –0.4035674      0.069825915      –5.779622062    6.62469E-07
Voltage      0.428756573     0.027841579      15.3998654      1.47975E-19

The best linear model is determined to be:

Ŷ = 3.6833 + 0.1548X1 – 0.04197X3 – 0.4036X4 + 0.4288X5. The overall model has F = 77.0793 (4 and 45 degrees of freedom) with a p-value that is virtually 0. r2 = 0.8726, adjusted r2 = 0.8613. The normal probability plot does not suggest a possible violation of the normality assumption. A residual analysis reveals a potential nonlinear relationship in temperature.

The p-value of the squared term for temperature in the following quadratic transformation of temperature does not support the need for a quadratic transformation at the 5% level of significance.

Variable        Coefficients    Standard Error   t Stat          P-value
Intercept       –322.0541757    209.7341228      –1.535535426    0.131812685
Catalyst        0.16474942      0.035632189      4.623612047     3.30582E-05
Pressure        –0.044452151    0.01334035       –3.332157868    0.001753581
Temp            7.27648367      4.9417966        1.472436901     0.148020497
Temp Squared    –0.04508917     0.029010216      –1.554251433    0.127288634
Voltage         0.424662847     0.027539942      15.41988889     2.34994E-19

The p-value of the interaction term between pressure and temperature below indicates that there is not enough evidence of an interaction at the 5% level of significance.

Variable           Coefficients    Standard Error   t Stat          P-value
Intercept          103.5523674     55.92763384      1.851542078     0.070809822
Catalyst           0.144935857     0.035157935      4.122422311     0.000163315
Pressure           –0.859424944    0.453254189      –1.896121349    0.064522645
Temp               –1.586885548    0.659370559      –2.406667277    0.020363651
Pressure x Temp    0.009640768     0.005343284      1.804277709     0.078035623
Voltage            0.431941042     0.027226309      15.86484039     8.10114E-20

The best model is still the one that includes catalyst, pressure, temperature, and voltage, which explains 87.26% of the variation in thickness.

15.45

(a)

In the multiple regression model with all the possible independent variables, none of the variables have a VIF value of 5 or larger. The best-subset approach yielded only the following models to be considered:

Model         Cp       k+1   R Square   Adj. R Square   Std. Error   Consider This Model?
X1X2          2.4647   3     0.6124     0.5668          0.0365       Yes
X1X2X3        3.6599   4     0.6313     0.5622          0.0367       Yes
X1X2X4        2.9741   4     0.6475     0.5814          0.0359       Yes
X1X2X3X4      4.1693   5     0.6664     0.5775          0.0360       Yes
X1X2X4X5      4.8048   5     0.6515     0.5585          0.0368       Yes
X1X2X3X4X5    6.0000   6     0.6704     0.5527          0.0371       Yes

where X1 = Time, X2 = Pressure, X3 = Amplitude, X4 = Density, and X5 = Quantity. The model with the highest adjusted r-square contains X1, X2, X4 and yields the following results:

Variable     Coefficients   Standard Error   t Stat    P-value   Lower 95%   Upper 95%
Intercept    0.0884         0.0816           1.0835    0.2946    -0.0845     0.2613
Time         0.0022         0.0004           4.8180    0.0002    0.0012      0.0031
Pressure     0.0013         0.0006           2.1406    0.0481    0.0000      0.0025
Density      0.2263         0.1793           1.2620    0.2250    -0.1538     0.6063


Looking at the p-value of the t-test statistic for each slope coefficient reveals that X4 is not significant at the 5% level of significance. The results after dropping X4 follow:

Regression Statistics
Multiple R           0.7826
R Square             0.6124
Adjusted R Square    0.5668
Standard Error       0.0365
Observations         20

ANOVA
              df    SS       MS       F         Significance F
Regression    2     0.0357   0.0179   13.4294   0.0003
Residual      17    0.0226   0.0013
Total         19    0.0584

Variable     Coefficients   Standard Error   t Stat    P-value   Lower 95%   Upper 95%
Intercept    0.1789         0.0395           4.5242    0.0003    0.0955      0.2623
Time         0.0022         0.0005           4.7362    0.0002    0.0012      0.0031
Pressure     0.0013         0.0006           2.1042    0.0505    0.0000      0.0026

The p-value of the t-test statistic for the slope coefficient for pressure is only slightly above 5%. For parsimony consideration, you might drop pressure from the model. The best model is Ŷ = 0.1789 + 0.0022X1 + 0.0013X2. The normal probability plot suggests possible departure from the normality assumption. The residual plots do not reveal any specific pattern.

[Figure: Normal Probability Plot (Residual versus Z Value)]

[Figure: Residual Plot for Time (Residuals versus Time)]

[Figure: Residual Plot for Pressure (Residuals versus Pressure)]

(b)

In the multiple regression model with all the possible independent variables, none of the variables have a VIF value of 5 or larger. The best-subset approach yielded the following models to be considered:

Model        Cp       k+1   R Square   Adj. R Square   Std. Error   Consider This Model?
X1X4         2.8943   3     0.5574     0.5053          0.0393       Yes
X1X2X4       2.2854   4     0.6258     0.5556          0.0373       Yes
X1X2X3X4     4.0090   5     0.6330     0.5351          0.0381       Yes
X1X2X4X5     4.2764   5     0.6260     0.5263          0.0385       Yes
X1X2X3X4X5   6.0000   6     0.6332     0.5023          0.0395       Yes

where X1 = Time, X2 = Pressure, X3 = Amplitude, X4 = Density, and X5 = Quantity.

The model with the highest adjusted r-square contains X1, X2, and X4 and yields the following results:

            Coefficients   Standard Error   t Stat   P-value
Intercept   0.0146         0.0848           0.1715   0.8660
Time        0.0020         0.0005           4.2165   0.0007
Pressure    0.0011         0.0006           1.7094   0.1067
Density     0.4588         0.1865           2.4602   0.0256

Looking at the p-value of the t-test statistic for each slope coefficient reveals that X2 is not significant at the 5% level of significance. The results after dropping X2 follow:

Regression Statistics
Multiple R          0.7466
R Square            0.5574
Adjusted R Square   0.5053
Standard Error      0.0393
Observations        20

ANOVA
             df   SS       MS       F         Significance F
Regression   2    0.0331   0.0166   10.7053   0.0010
Residual     17   0.0263   0.0015
Total        19   0.0595

            Coefficients   Standard Error   t Stat   P-value   Lower 95%   Upper 95%
Intercept   0.0624         0.0845           0.7380   0.4706    -0.1159     0.2406
Time        0.0020         0.0005           3.9966   0.0009     0.0009     0.0030
Density     0.4588         0.1967           2.3319   0.0323     0.0437     0.8738

The p-value of the t-test statistic for all the slope coefficients is < 5%. The best model is $\hat{Y} = 0.0624 + 0.0020X_1 + 0.4588X_4$. The normal probability plot suggests possible slight departure from the normality assumption. The residual plots suggest possible violation of the equal variances assumption.

[Figure: Normal Probability Plot (Residual vs. Z Value)]

[Figure: Residual Plot for Time (Residuals vs. Time)]

[Figure: Residual Plot for Density (Residuals vs. Density)]

(c) The most appropriate model to predict the product length in cavity 1 includes time and pressure, while the most appropriate model to predict the product length in cavity 2 includes time and density.

(d) In the multiple regression model with all the possible independent variables, none of the variables have a VIF value of 5 or larger.

The best-subset approach yielded the following models to be considered:

Model        Cp       k+1   R Square   Adj. R Square   Std. Error   Consider This Model?
X1           0.1360   2     0.2444     0.2025          0.1737       Yes
X1X2         1.9472   3     0.2533     0.1654          0.1777       Yes
X1X3         2.1345   3     0.2445     0.1556          0.1788       Yes
X1X4         1.3053   3     0.2833     0.1990          0.1741       Yes
X1X5         1.0212   3     0.2966     0.2139          0.1725       Yes
X1X2X3       3.9456   4     0.2533     0.1134          0.1832       Yes
X1X2X4       3.1164   4     0.2922     0.1595          0.1784       Yes
X1X2X5       2.8323   4     0.3055     0.1753          0.1767       Yes
X1X3X4       3.3037   4     0.2834     0.1490          0.1795       Yes
X1X3X5       3.0196   4     0.2967     0.1648          0.1778       Yes
X1X4X5       2.1904   4     0.3355     0.2109          0.1728       Yes
X1X2X3X5     4.8307   5     0.3056     0.1204          0.1825       Yes
X1X2X4X5     4.0016   5     0.3444     0.1695          0.1773       Yes
X1X3X4X5     4.1889   5     0.3356     0.1584          0.1785       Yes
X1X2X3X4X5   6.0000   6     0.3445     0.1103          0.1835       Yes

where X1 = Time, X2 = Pressure, X3 = Amplitude, X4 = Density, and X5 = Quantity.

The model with the highest adjusted r-square contains X1 and X5 and yields the following results:

            Coefficients   Standard Error   t Stat   P-value
Intercept   0.3724         0.2025           1.8392   0.0834
Time        0.0052         0.0022           2.4306   0.0264
Quantity    0.0484         0.0431           1.1233   0.2769

Looking at the p-value of the t-test statistic for each slope coefficient reveals that X5 is not significant at the 5% level of significance. The results after dropping X5 follow:

Regression Statistics
Multiple R          0.4944
R Square            0.2444
Adjusted R Square   0.2025
Standard Error      0.1737
Observations        20

ANOVA
             df   SS       MS       F        Significance F
Regression   1    0.1758   0.1758   5.8231   0.0267
Residual     18   0.5433   0.0302
Total        19   0.7191

            Coefficients   Standard Error   t Stat   P-value   Lower 95%   Upper 95%
Intercept   0.5420         0.1360           3.9858   0.0009    0.2563      0.8276
Time        0.0052         0.0022           2.4131   0.0267    0.0007      0.0098

The p-value of the t-test statistic for the significance of time is < 5%. The best model is $\hat{Y} = 0.5420 + 0.0052X_1$. None of the observations have a Cook's $D_i > F = 0.8212$ with d.f. = 3 and 17. Hence, using the Studentized deleted residuals, hat matrix diagonal elements, and Cook's distance statistic together, there is insufficient evidence for removal of any observation from the model. The normal probability plot suggests possible slight departure from the normality assumption. The residual plots suggest possible violation of the equal variances assumption.

[Figure: Normal Probability Plot (Residuals vs. Z Value)]

[Figure: Residual Plot (Residuals vs. X)]

(e)

In the multiple regression model with all the possible independent variables, none of the variables have a VIF value of 5 or larger.

The best-subset approach yielded the following models to be considered:

Model        Cp        k+1   R Square   Adj. R Square   Std. Error   Consider This Model?
X1           -0.6667   2     0.2023     0.1580          0.1831       Yes
X1X2          1.1217   3     0.2133     0.1208          0.1871       Yes
X1X3          1.3164   3     0.2032     0.1094          0.1883       Yes
X1X4          1.0492   3     0.2171     0.1250          0.1867       Yes
X1X5          0.5125   3     0.2450     0.1562          0.1833       Yes
X1X2X3        3.1049   4     0.2142     0.0668          0.1928       Yes
X1X2X4        2.8376   4     0.2281     0.0834          0.1911       Yes
X1X2X5        2.3009   4     0.2560     0.1165          0.1876       Yes
X1X3X4        3.0323   4     0.2180     0.0713          0.1923       Yes
X1X3X5        2.4956   4     0.2459     0.1045          0.1888       Yes
X1X4X5        2.2284   4     0.2598     0.1210          0.1871       Yes
X1X2X3X4      4.8208   5     0.2290     0.0234          0.1972       Yes
X1X2X3X5      4.2841   5     0.2569     0.0587          0.1936       Yes
X1X2X4X5      4.0168   5     0.2708     0.0763          0.1918       Yes
X1X3X4X5      4.2115   5     0.2607     0.0635          0.1931       Yes
X1X2X3X4X5    6.0000   6     0.2717     0.0115          0.1984       Yes

where X1 = Time, X2 = Pressure, X3 = Amplitude, X4 = Density, and X5 = Quantity.

The model with the highest adjusted r-square contains X1 and yields the following results:

Regression Statistics
Multiple R          0.4498
R Square            0.2023
Adjusted R Square   0.1580
Standard Error      0.1831
Observations        20

ANOVA
             df   SS       MS       F        Significance F
Regression   1    0.1531   0.1531   4.5650   0.0466
Residual     18   0.6036   0.0335
Total        19   0.7567

            Coefficients   Standard Error   t Stat   P-value   Lower 95%   Upper 95%
Intercept   0.5673         0.1433           3.9586   0.0009    0.2662      0.8684
Time        0.0049         0.0023           2.1366   0.0466    0.0001      0.0097

Looking at the p-value of the t-test statistic for the slope coefficient reveals that X1 is significant at the 5% level of significance. The best model is $\hat{Y} = 0.5673 + 0.0049X_1$.

None of the observations have a Cook's $D_i > F = 0.8212$ with d.f. = 3 and 17. Hence, using the Studentized deleted residuals, hat matrix diagonal elements, and Cook's distance statistic together, there is insufficient evidence for removal of any observation from the model. The normal probability plot does not suggest any departure from the normality assumption. The residual plots suggest possible violation of the equal variances assumption.

[Figure: Normal Probability Plot (Residuals vs. Z Value)]

[Figure: Residual Plot (Residuals vs. X)]

(f)

The most appropriate model to predict the product weight in both cavity 1 and cavity 2 contains only the variable time. A slightly higher percentage of the variation in product weight is explained by the variation in time for cavity 1 than for cavity 2: $r^2 = 0.2444$ for cavity 1, while $r^2 = 0.2023$ for cavity 2.

15.46

From PHStat, Y = average annual salary, X1 = unemployment rate, X2 = median home price, X3 = violent crime per 100,000 residents, X4 = average commute time, X5 = livability score.

                                      Coefficients   Standard Error   t Stat    P-value   R Square   VIF
Intercept                             -8258.7031     15018.7217       -0.5499   0.5888
Unemployment Rate                       -92.5228       853.7659       -0.1084   0.9148    0.3720     1.5925
Median Home Price                         0.0274         0.0044        6.2698   0.0000    0.4337     1.7658
Violent Crime per 100,000 residents       0.3855         9.0627        0.0425   0.9665    0.4667     1.8752
Average Commute Time                    763.1739       449.6812        1.6971   0.1060    0.6725     3.0534
Livability Score                        608.8310       207.4193        2.9353   0.0085    0.1030     1.1148
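The VIF column follows directly from the R Square column, which holds the coefficient of determination from regressing each independent variable on all the others. The sketch below assumes the usual definition VIF_j = 1/(1 - R_j^2) and applies the cutoff of 5 used in these solutions; the dictionary and variable names are illustrative.

```python
def vif(r_square_j: float) -> float:
    """Variance inflationary factor from the R-square of X_j regressed on the other X's."""
    return 1.0 / (1.0 - r_square_j)


# R Square values copied from the PHStat output for problem 15.46
r_square_by_variable = {
    "Unemployment Rate": 0.3720,
    "Median Home Price": 0.4337,
    "Violent Crime per 100,000 residents": 0.4667,
    "Average Commute Time": 0.6725,
    "Livability Score": 0.1030,
}

vifs = {name: vif(r2) for name, r2 in r_square_by_variable.items()}
flagged = [name for name, value in vifs.items() if value >= 5]  # candidates for removal

for name, value in vifs.items():
    print(f"{name}: VIF = {value:.4f}")
print("Variables with VIF >= 5:", flagged)
```

With these inputs no variable reaches the cutoff, so the full set of five variables is carried into the best-subsets step.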

A regression analysis revealed that all five variables had VIF values below 5, so there is no reason to suspect collinearity between any pair of variables. A best subsets regression analysis reveals that a three-variable model with median home price, average commute time, and livability score has the lowest Cp value and the highest adjusted $r^2$. The best-subset approach yielded: Cp = 2.0157 with X2X4X5 and adjusted $r^2$ = 0.8543. (PHStat display of 3 smallest Cp)

Model      Cp       k+1   R Square   Adj. R Square   Std. Error
X2X4X5     2.0157   4     0.8725     0.8543          4473.4675
X1X2X4X5   4.0018   5     0.8726     0.8471          4582.2602
X2X3X4X5   4.0117   5     0.8725     0.8470          4583.4579

Using Stepwise Regression Analysis in PHStat (partial display) reveals the best model:

                       Coefficients   Standard Error   t Stat    P-value
Intercept              -8905.7247     13150.0706       -0.6772   0.5056
Median Home Price          0.0272         0.0039        6.9839   0.0000
Livability Score         613.1958       194.1231        3.1588   0.0047
Average Commute Time     760.2396       295.8934        2.5693   0.0179

The most appropriate multiple regression model for predicting average annual salary is $\hat{Y} = -8905.7247 + 0X_1 + 0.0272X_2 + 0X_3 + 760.2396X_4 + 613.1958X_5$, for X1 = unemployment rate, X2 = median home price, X3 = violent crime per 100,000 residents, X4 = average commute time, X5 = livability score. The adjusted $r^2$ for the best model is 0.8543 and the $r^2$ for the model is 0.8725, so 87.25% of the variation in average annual salary can be explained by variation in median home price, average commute time, and livability score.

Residual analyses reveal no potential violations in assumptions.

15.47

From PHStat, Y = wins, X1 = field goal percentage, X2 = three-point percentage, X3 = free throw percentage, X4 = rebounds, X5 = assists, X6 = turnovers.

                         Coefficients   Standard Error   t Stat    P-value   R Square   VIF
Intercept                -338.4853       82.2815         -4.1137   0.0004
Field Goal Percentage     223.6901      136.5881          1.6377   0.1151    0.6174     2.6134
Three-Point Percentage    322.2388      118.8202          2.7120   0.0124    0.4428     1.7947
Free Throw Percentage      79.4471       53.5478          1.4837   0.1515    0.2857     1.4000
Rebounds                    2.8376        0.8781          3.2315   0.0037    0.1763     1.2140
Assists                     0.0694        0.9365          0.0741   0.9416    0.4282     1.7488
Turnovers                  -1.9662        1.4960         -1.3143   0.2017    0.2357     1.3084

A regression analysis revealed that all six variables had VIF values below 5. So, there is no reason to suspect collinearity between any pair of variables. A best subsets regression analysis reveals that a four-variable model with field goal percentage, three-point percentage, free throw percentage, and rebounds has the lowest Cp value.

The best-subset approach yielded: Cp = 4.7442 with X1X2X3X4 and adjusted $r^2$ = 0.6722. (PHStat display of 3 smallest Cp)

Model        Cp       k+1   R Square   Adj. R Square   Std. Error
X1X2X3X4     4.7442   5     0.7174     0.6722          6.6098
X1X2X4X6     5.2054   5     0.7121     0.6661          6.6711
X1X2X3X4X6   5.0055   6     0.7373     0.6825          6.5048

Using Stepwise Regression Analysis in PHStat (partial display) reveals the best model:

                         Coefficients   Standard Error   t Stat    P-value
Intercept                -360.9347      61.7701          -5.8432   0.0000
Three-Point Percentage    481.6958      99.9288           4.8204   0.0001
Rebounds                    3.1252       0.8689           3.5967   0.0013
Free Throw Percentage     119.5796      52.2259           2.2897   0.0304

The most appropriate multiple regression model for predicting wins is $\hat{Y} = -360.9347 + 0X_1 + 481.6958X_2 + 119.5796X_3 + 3.1252X_4 + 0X_5 + 0X_6$, for X1 = field goal percentage, X2 = three-point percentage, X3 = free throw percentage, X4 = rebounds, X5 = assists, X6 = turnovers. Residual analyses reveal no potential violations in assumptions.

Residual analyses reveal no potential violations in the equal variance assumption. However, there appears to be some deviation from the normality assumption.

15.48

All analyses and associated summaries are provided with problems 15.38–15.42.

Chapter 16

16.1

Use this year's smoothed value as the forecast: 41.3 million in constant (this-year) dollars.

16.2

(a) Since you need data from four prior years to obtain the centered 9-year moving average for any given year and since the first recorded value is for 1984, the first centered moving average value you can calculate is for 1988.

(b) You would lose four years for the period 1984–1987 since you do not have enough past values to compute a centered moving average. You will also lose the final four years of the recorded time series since you do not have enough later values to compute a centered moving average. Therefore, you will lose a total of eight years in computing a series of 9-year moving averages.

16.3

(a) $E_{2022} = (0.20)(12.3) + (0.80)(9.5) = 10.06$

(b) $E_{2023} = 0.20Y_{2023} + 0.80E_{2022} = (0.20)(12.6) + (0.80)(10.06) = 10.57$

16.4

(a) Time Series Plot: Enrollment vs Year

(b) Three-year moving average

Year   Enrollment   MA(3)
2011   956          #N/A
2012   933          935.3333
2013   917          936.6667
2014   960          927.6667
2015   906          955.0000
2016   999          960.0000
2017   975          958.6667
2018   902          933.6667
2019   924          930.6667
2020   966          959.3333
2021   988          973.0000
2022   965          976.5000
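The interior MA(3) values in the table can be reproduced by averaging each year with its two neighbors. The sketch below is a minimal centered moving average for an odd window length; the function name is illustrative. It returns no value where a full window is unavailable, whereas the worksheet value shown for 2022, 976.5000, is the average of only the last two years, (988 + 965)/2.

```python
from typing import List, Optional


def centered_moving_average(values: List[float], length: int = 3) -> List[Optional[float]]:
    """Centered moving average of odd length; None where a full window is unavailable."""
    if length % 2 == 0:
        raise ValueError("length must be odd for a centered moving average")
    half = length // 2
    result: List[Optional[float]] = []
    for i in range(len(values)):
        if i < half or i + half >= len(values):
            result.append(None)  # not enough earlier or later values
        else:
            window = values[i - half : i + half + 1]
            result.append(sum(window) / length)
    return result


years = list(range(2011, 2023))
enrollment = [956, 933, 917, 960, 906, 999, 975, 902, 924, 966, 988, 965]

for year, ma in zip(years, centered_moving_average(enrollment, 3)):
    print(year, "#N/A" if ma is None else f"{ma:.4f}")
```

The same function with `length=9` drops four values at each end, which is the eight-year loss described in problem 16.2.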

[Figure: Three-year moving average for Enrollment]

(c) W = 0.50, exponentially smooth the series

Year   Enrollment   ES(0.50)
2011   956          956.0000
2012   933          944.5000
2013   917          930.7500
2014   960          945.3750
2015   906          925.6875
2016   999          962.3438
2017   975          968.6719
2018   902          935.3359
2019   924          929.6680
2020   966          947.8340
2021   988          967.9170
2022   965          966.4585
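The ES column can be reproduced with the exponential smoothing recursion $E_1 = Y_1$ and $E_i = WY_i + (1 - W)E_{i-1}$, with the last smoothed value serving as the forecast for the next period. The sketch below is a minimal implementation of that recursion; the function name is illustrative.

```python
from typing import List


def exponential_smooth(values: List[float], w: float) -> List[float]:
    """Exponentially smoothed series: E_1 = Y_1, E_i = W*Y_i + (1 - W)*E_{i-1}."""
    if not 0 < w < 1:
        raise ValueError("the smoothing coefficient W must be between 0 and 1")
    smoothed = [float(values[0])]
    for y in values[1:]:
        smoothed.append(w * y + (1 - w) * smoothed[-1])
    return smoothed


enrollment = [956, 933, 917, 960, 906, 999, 975, 902, 924, 966, 988, 965]

es_050 = exponential_smooth(enrollment, 0.50)
es_025 = exponential_smooth(enrollment, 0.25)

# The forecast for 2023 is the smoothed value for 2022
print(f"W = 0.50 forecast: {es_050[-1]:.4f}")
print(f"W = 0.25 forecast: {es_025[-1]:.4f}")
```

A larger W makes the smoothed series react faster to the most recent observation, which is why the two forecasts differ.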

(d) $\hat{Y}_{2023} = E_{2022} = 966.4585$

(e) W = 0.25, exponentially smooth the series

Year   Enrollment   ES(0.25)
2011   956          956.0000
2012   933          950.2500
2013   917          941.9375
2014   960          946.4531
2015   906          936.3398
2016   999          952.0049
2017   975          957.7537
2018   902          943.8152
2019   924          938.8614
2020   966          945.6461
2021   988          956.2346
2022   965          958.4259

$\hat{Y}_{2023} = E_{2022} = 958.4259$

(f) The exponential smoothing with W = 0.50 assigns more weight to the more recent values and is better for short-term forecasting, while the exponential smoothing with W = 0.25, which assigns more weight to more distant values, is better suited for identifying long-term tendencies.

(g) There is no perceptible trend in the annual enrollment in the introductory business statistics courses from 2011 to 2022. Enrollment has hovered around 950 students.

16.5

(a)

Time Series Plot: Accounting Majors: Major vs Year

(b) Three-year moving average for Majors

Year   Majors   MA(3)
2011   283      #N/A
2012   290      294.0000
2013   309      321.3333
2014   365      330.0000
2015   316      323.0000
2016   288      310.0000
2017   326      303.3333
2018   296      330.6667
2019   370      326.0000
2020   312      340.0000
2021   338      309.6667
2022   279      308.5000


(c) W = 0.50, exponentially smooth the series

Year   Majors   ES(0.50)
2011   283      283.0000
2012   290      286.5000
2013   309      297.7500
2014   365      331.3750
2015   316      323.6875
2016   288      305.8438
2017   326      315.9219
2018   296      305.9609
2019   370      337.9805
2020   312      324.9902
2021   338      331.4951
2022   279      305.2476

[Figure: Exponentially Smoothed Majors, ES(0.50), W = 0.50 (Majors and ES(0.50) vs. Year)]

(d) $\hat{Y}_{2023} = E_{2022} = 305.2476$
(e) W = 0.25, exponentially smooth the series

Year   Majors   ES(0.25)
2011   283      283.0000
2012   290      284.7500
2013   309      290.8125
2014   365      309.3594
2015   316      311.0195
2016   288      305.2646
2017   326      310.4485
2018   296      306.8364
2019   370      322.6273
2020   312      319.9705
2021   338      324.4778
2022   279      313.1084

$\hat{Y}_{2023} = E_{2022} = 313.1084$

(f) The exponential smoothing with W = 0.50 assigns more weight to the more recent values and is better for short-term forecasting, while the exponential smoothing with W = 0.25, which assigns more weight to more distant values, is better suited for identifying long-term tendencies.

(g) There is no perceptible trend in the annual number of declared accounting majors from 2011 to 2022. The number of majors has hovered around 313 students.

16.6

(a)

Time Series Plot: Stock Performance, Decade vs Performance (%)

(b) Three-period moving average

Decade   Performance   MA(3)
1830s     2.8          #N/A
1840s    12.8           7.4000
1850s     6.6          10.6333
1860s    12.5           8.8667
1870s     7.5           8.6667
1880s     6.0           6.3333
1890s     5.5           7.4667
1900s    10.9           6.2000
1910s     2.2           8.8000
1920s    13.3           4.4333
1930s    -2.2           6.9000
1940s     9.6           8.5333
1950s    18.2          12.0333
1960s     8.3          11.0333
1970s     6.6          10.5000
1980s    16.6          13.6000
1990s    17.6          11.8000
2000s     1.2          10.8000
2010s    13.6          #N/A

[Figure: Three-period moving average for Stock Performance]

(c) W = 0.50, exponentially smooth the series

Decade   Performance   ES(0.50)
1830s     2.8           2.8000
1840s    12.8           7.8000
1850s     6.6           7.2000
1860s    12.5           9.8500
1870s     7.5           8.6750
1880s     6.0           7.3375
1890s     5.5           6.4188
1900s    10.9           8.6594
1910s     2.2           5.4297
1920s    13.3           9.3648
1930s    -2.2           3.5824
1940s     9.6           6.5912
1950s    18.2          12.3956
1960s     8.3          10.3478
1970s     6.6           8.4739
1980s    16.6          12.5370
1990s    17.6          15.0685
2000s     1.2           8.1342
2010s    13.6          10.8671

[Figure: Exponentially smoothed Stock Performance, W = 0.50]

(d) $\hat{Y}_{2020s} = E_{2010s} = 10.8671$

(e) W = 0.25, exponentially smooth the series

Decade   Performance   ES(0.25)
1830s     2.8           2.8000
1840s    12.8           5.3000
1850s     6.6           5.6250
1860s    12.5           7.3438
1870s     7.5           7.3828
1880s     6.0           7.0371
1890s     5.5           6.6528
1900s    10.9           7.7146
1910s     2.2           6.3360
1920s    13.3           8.0770
1930s    -2.2           5.5077
1940s     9.6           6.5308
1950s    18.2           9.4481
1960s     8.3           9.1611
1970s     6.6           8.5208
1980s    16.6          10.5406
1990s    17.6          12.3055
2000s     1.2           9.5291
2010s    13.6          10.5468

[Figure: Exponentially smoothed Stock Performance, W = 0.25]

$\hat{Y}_{2020s} = E_{2010s} = 10.5468$

(f) The exponentially smoothed forecast for the 2020s is higher with a W of 0.50 (10.8671) than with a W of 0.25 (10.5468). The exponential smoothing with W = 0.50 assigns more weight to the more recent values and is better for short-term forecasting, while the exponential smoothing with W = 0.25, which assigns more weight to more distant values, is better suited for identifying long-term tendencies.

(g) Exponential smoothing with a W of 0.25 reveals a general upward trend in the performance of stocks over the last several decades.

16.7

(a) Time Series Plot: Coffee Exports (thousands 60 kg bag) vs Year

(b) Three-year moving average for Coffee Exports

Year   Exports (thousands 60 kg bag)   MA(3)
2004   10,194                          #N/A
2005   10,871                          10670.14
2006   10,945                          11038.84
2007   11,300                          11110.15
2008   11,085                          10093.17
2009    7,894                           8933.58
2010    7,822                           7816.40
2011    7,734                           7575.15
2012    7,170                           8191.25
2013    9,670                           9264.84
2014   10,954                          11113.57
2015   12,716                          12167.26
2016   12,831                          12844.13
2017   12,985                          12845.33
2018   12,720                          12405.00
2019   11,510                          11842.67
2020   11,298                          11436.33
2021   11,501                          #N/A

(c) W = 0.50, exponentially smooth the series

Year   Exports (thousands 60 kg bag)   ES(0.50)
2004   10,194                          10194.00
2005   10,871                          10532.62
2006   10,945                          10738.74
2007   11,300                          11019.58
2008   11,085                          11052.37
2009    7,894                           9473.15
2010    7,822                           8647.39
2011    7,734                           8190.51
2012    7,170                           7680.36
2013    9,670                           8675.13
2014   10,954                           9814.77
2015   12,716                          11265.58
2016   12,831                          12048.29
2017   12,985                          12516.64
2018   12,720                          12618.32
2019   11,510                          12064.16
2020   11,298                          11681.08
2021   11,501                          11591.04

(d) $\hat{Y}_{2022} = E_{2021} = 11{,}591.04$ in thousands of 60 kg bags

(e) W = 0.25, exponentially smooth the series

Year   Exports (thousands 60 kg bag)   ES(0.25)
2004   10,194                          10194.00
2005   10,871                          10363.31
2006   10,945                          10508.70
2007   11,300                          10706.63
2008   11,085                          10801.26
2009    7,894                          10074.43
2010    7,822                           9511.23
2011    7,734                           9066.83
2012    7,170                           8592.67
2013    9,670                           8861.98
2014   10,954                           9385.09
2015   12,716                          10217.91
2016   12,831                          10871.18
2017   12,985                          11399.64
2018   12,720                          11729.73
2019   11,510                          11674.80
2020   11,298                          11580.60
2021   11,501                          11560.70

$\hat{Y}_{2022} = E_{2021} = 11{,}560.70$ in thousands of 60 kg bags

(f) The exponentially smoothed 2022 forecast for Costa Rica coffee exports is lower with a W of 0.25 compared to a W of 0.50. The exponential smoothing with W = 0.50 assigns more weight to the more recent values and is better for short-term forecasting, while the exponential smoothing with W = 0.25, which assigns more weight to more distant values, is better suited for identifying long-term tendencies.

(g) Exponential smoothing with a W of 0.25 reveals an upward trend from 2004 to 2008 in Costa Rica coffee exports, which was followed by a decline in exports from 2009 until 2013. An increase in coffee exports was observed from 2014 to 2021.

16.8

(a) Time Series Plot: IPOs vs Year

(b) Three-year moving average for IPOs

Year   Number of IPOs   MA(3)
2000   397              #N/A
2001   141              240.3333
2002   183              157.3333
2003   148              215.0000
2004   314              249.3333
2005   286              273.3333
2006   220              258.0000
2007   268              183.3333
2008    62              136.3333
2009    79              110.3333
2010   190              146.6667
2011   171              172.6667
2012   157              193.0000
2013   251              237.3333
2014   304              253.6667
2015   206              214.3333
2016   133              185.3333
2017   217              201.6667
2018   255              234.6667
2019   232              322.3333
2020   480              #N/A

[Figure: Three-year moving average for IPOs]

(c) W = 0.50, exponentially smooth the series

Year   Number of IPOs   ES(0.50)
2000   397              397.0000
2001   141              269.0000
2002   183              226.0000
2003   148              187.0000
2004   314              250.5000
2005   286              268.2500
2006   220              244.1250
2007   268              256.0625
2008    62              159.0313
2009    79              119.0156
2010   190              154.5078
2011   171              162.7539
2012   157              159.8770
2013   251              205.4385
2014   304              254.7192
2015   206              230.3596
2016   133              181.6798
2017   217              199.3399
2018   255              227.1700
2019   232              229.5850
2020   480              354.7925

[Figure: Exponentially smoothed IPOs, W = 0.50]

(d) $\hat{Y}_{2021} = E_{2020} = 354.7925$

(e) W = 0.25, exponentially smooth the series

Year   Number of IPOs   ES(0.25)
2000   397              397.0000
2001   141              333.0000
2002   183              295.5000
2003   148              258.6250
2004   314              272.4688
2005   286              275.8516
2006   220              261.8887
2007   268              263.4165
2008    62              213.0624
2009    79              179.5468
2010   190              182.1601
2011   171              179.3701
2012   157              173.7775
2013   251              193.0832
2014   304              220.8124
2015   206              217.1093
2016   133              196.0820
2017   217              201.3115
2018   255              214.7336
2019   232              219.0502
2020   480              284.2877

[Figure: Exponentially smoothed IPOs, W = 0.25]

$\hat{Y}_{2021} = E_{2020} = 284.2877$

(f) The exponentially smoothed 2021 forecast for IPOs is lower with a W of 0.25 compared to a W of 0.50. The exponential smoothing with W = 0.50 assigns more weight to the more recent values and is better for short-term forecasting, while the exponential smoothing with W = 0.25, which assigns more weight to more distant values, is better suited for identifying long-term tendencies.

(g) There appears to be a cyclical component every several years with up and down cycles of the number of IPOs. The fact that the forecast was so incorrect makes you want to use other approaches for short-term forecasting.

16.9

(a) X = 0
(b) X = 4
(c) X = 30
(d) X = 35

16.10

(a) The Y-intercept $b_0 = 4.0$ is the fitted trend value reflecting the real total revenues (in millions of dollars) during the origin or base year 2001.

(b) The slope $b_1 = 1.5$ indicates that the real total revenues are increasing at an estimated rate of 1.5 million dollars per year.

(c) Year is 2005, X = 2005 – 2001 = 4, $\hat{Y}_{2005} = 4.0 + 1.5(4) = 10.0$ million dollars

(d) Year is 2022, X = 2022 – 2001 = 21, $\hat{Y}_{2022} = 4.0 + 1.5(21) = 35.5$ million dollars

(e) Year is 2025, X = 2025 – 2001 = 24, $\hat{Y}_{2025} = 4.0 + 1.5(24) = 40.0$ million dollars
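The three forecasts in 16.10 all follow the same two steps: code the year relative to the base year, then substitute into the linear trend equation. The sketch below shows that calculation; the function name and argument names are illustrative.

```python
def linear_trend_forecast(b0: float, b1: float, base_year: int, year: int) -> float:
    """Linear trend forecast Y-hat = b0 + b1*X, where X = year - base_year."""
    x = year - base_year  # coded year, 0 in the base year
    return b0 + b1 * x


# Problem 16.10: b0 = 4.0, b1 = 1.5, base year 2001 (millions of dollars)
for year in (2005, 2022, 2025):
    forecast = linear_trend_forecast(4.0, 1.5, base_year=2001, year=year)
    print(year, forecast)
```

Keeping the coding step inside the function avoids the most common slip in these problems, which is substituting the calendar year instead of the coded year.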

16.11

(a) The Y-intercept $b_0$ of 1.7 is the predicted mean sales in $billions during the base year, 2001.

(b) The slope $b_1$ of 0.4 indicates that mean sales in $billions are predicted to increase at an estimated rate of $0.4 billion per year.


(c) $\hat{Y} = 1.7 + 0.4(9) = 5.3$ billion dollars

(d) Most recent year is 2022, X = 2022 – 2001 = 21, $\hat{Y} = 1.7 + 0.4(21) = 10.1$ billion dollars

(e) Year is 2024, X = 2024 – 2001 = 23, $\hat{Y} = 1.7 + 0.4(23) = 10.9$ billion dollars

16.12

(a)

(b) Bonus($thousands) vs Coded Year: Simple Linear Regression Analysis

Regression Statistics
Multiple R          0.7641
R Square            0.5838
Adjusted R Square   0.5630
Standard Error      30.8973
Observations        22

ANOVA
             df   SS           MS           F
Regression   1    26780.8753   26780.8753   28.0532
Residual     20   19092.9034     954.6452
Total        21   45873.7786

             Coefficients   Standard Error   t Stat   P-value
Intercept    89.1332        12.7378          6.9975   0.0000
Coded Year    5.4994         1.0383          5.2965   0.0000

Linear model: Predicted Bonus = $\hat{Y} = 89.1332 + 5.4994X$, where X = years relative to 2000. t = 5.2965, p-value = 0.0000 < 0.05; coded year is significant; $r^2$ = 0.5838, so 58.38% of the variation in bonuses is explained by year.
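The summary measures in this output are linked through the ANOVA sums of squares. The sketch below assumes the standard relationships $r^2 = SSR/SST$, $F = MSR/MSE$, and standard error $= \sqrt{MSE}$, and recomputes them from the SSR and SSE values reported above; the function name and dictionary keys are illustrative.

```python
import math


def anova_summary(ssr: float, sse: float, k: int, n: int) -> dict:
    """Recompute r-square, F, and the standard error of the estimate from ANOVA sums of squares."""
    sst = ssr + sse
    msr = ssr / k            # regression mean square, k = number of independent variables
    mse = sse / (n - k - 1)  # residual mean square
    return {
        "r_square": ssr / sst,
        "F": msr / mse,
        "standard_error": math.sqrt(mse),
    }


# Linear trend model for bonuses: SSR = 26780.8753, SSE = 19092.9034, k = 1, n = 22
summary = anova_summary(26780.8753, 19092.9034, k=1, n=22)
for name, value in summary.items():
    print(f"{name}: {value:.4f}")
```

For a model with a single independent variable, F also equals the square of the slope's t statistic, which gives a second check on the table.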

[Figure: Fitted Line Plot of Bonus($thousands) vs Coded Year]

(c) Regression Analysis: Bonus($thousands) vs Coded Year, Coded Year Sq

Regression Statistics
Multiple R          0.7688
R Square            0.5911
Adjusted R Square   0.5481
Standard Error      31.4203
Observations        22

ANOVA
             df   SS           MS           F
Regression   2    27116.3539   13558.1770   13.7335
Residual     19   18757.4247     987.2329
Total        21   45873.7786

                Coefficients   Standard Error   t Stat   P-value
Intercept       96.7498        18.3986          5.2585   0.0000
Coded Year       3.2145         4.0595          0.7918   0.4382
Coded Year Sq    0.1088         0.1867          0.5829   0.5668

Quadratic model: Predicted Bonus = $\hat{Y} = 96.7498 + 3.2145X + 0.1088X^2$, where X = years relative to 2000. For the full model: F = 13.7335, p-value = 0.0000, so at least one X variable is significant. For Coded Year^2: t = 0.5829, p-value = 0.5668 > 0.05, so Coded Year^2 is not significant; $r^2$ = 0.5911, so 59.11% of the variation in bonuses is explained by year.

[Figure: Fitted Line Plot of Bonus($thousands) vs Coded Year, Coded Year Sq]

(d)

log(Bonus($thousands)) vs Coded Year
Simple Linear Regression Analysis

Regression Statistics
Multiple R           0.7616
R Square             0.5800
Adjusted R Square    0.5590
Standard Error       0.0995
Observations             22

ANOVA
             df      SS      MS        F
Regression    1  0.2732  0.2732  27.6224
Residual     20  0.1978  0.0099
Total        21  0.4711

              Coefficients  Standard Error   t Stat  P-value
Intercept           1.9595          0.0410  47.7890   0.0000
Coded Year          0.0176          0.0033   5.2557   0.0000

Exponential model: log(predicted bonus) = $\log_{10}\hat{Y} = 1.9595 + 0.0176X$, where X = years relative to 2000. t = 5.2557, p-value = 0.0000 < 0.05; Coded Year is significant. $r^2$ = 0.5800; 58% of the variation in the log of bonuses is explained by year.


Fitted Line Plot of log(Bonus($thousands)) vs Coded Year

(e)

Linear:
$\hat{Y}_{2022} = 89.1332 + 5.4994(22) = 210.1208$
$\hat{Y}_{2023} = 89.1332 + 5.4994(23) = 215.6202$

Quadratic: $\hat{Y} = 96.7498 + 3.2145X + 0.1088X^2$
$\hat{Y}_{2022} = 96.7498 + 3.2145(22) + 0.1088(22)^2 = 220.1312$
$\hat{Y}_{2023} = 96.7498 + 3.2145(23) + 0.1088(23)^2 = 228.2420$

Exponential:
$\log_{10}\hat{Y}_{2022} = 1.9595 + 0.0176(22) = 2.3460$, so $\hat{Y}_{2022} = 10^{2.3460} = 221.7999$
$\log_{10}\hat{Y}_{2023} = 1.9595 + 0.0176(23) = 2.3635$, so $\hat{Y}_{2023} = 10^{2.3635} = 230.9551$

(f)

Because the r2 values are similar, you would evaluate the residual plots to see if there were any patterns before choosing. Based on the principle of parsimony, one might choose the linear model.
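The three sets of forecasts in part (e) can be reproduced with a short script. This is a minimal sketch that uses the four-decimal coefficients printed in the regression output, so its results may differ from the manual's values in the last digits (the manual presumably carries unrounded coefficients); the names `MODELS` and `forecast` are illustrative.

```python
BASE_YEAR = 2000  # coded year X = calendar year - 2000, as in Problem 16.12


def linear(x):
    return 89.1332 + 5.4994 * x


def quadratic(x):
    return 96.7498 + 3.2145 * x + 0.1088 * x ** 2


def exponential(x):
    # The model is fitted to log10(Y), so the forecast is back-transformed.
    return 10 ** (1.9595 + 0.0176 * x)


MODELS = {"linear": linear, "quadratic": quadratic, "exponential": exponential}


def forecast(year):
    """Return the forecast from each model for a calendar year."""
    x = year - BASE_YEAR
    return {name: model(x) for name, model in MODELS.items()}


for year in (2022, 2023):
    for name, value in forecast(year).items():
        print(f"{year} {name:<12} {value:10.4f}")
```

Keeping the coded-year conversion in one place makes it harder to plug the calendar year into an equation that expects X.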

16.13

(a)

(b)

GDP vs Coded Year
Simple Linear Regression Analysis

Regression Statistics
Multiple R            0.9912
R Square              0.9825
Adjusted R Square     0.9821
Standard Error      787.0082
Observations              42

ANOVA
             df               SS               MS          F
Regression    1  1390018070.5883  1390018070.5883  2244.2020
Residual     40    24775275.9815      619381.8995
Total        41  1414793346.5698

              Coefficients  Standard Error   t Stat  P-value
Intercept        1389.3978        238.6022   5.8231   0.0000
Coded Year        474.6244         10.0189  47.3730   0.0000

$\hat{Y} = 1,389.3978 + 474.6244X$, where X = years relative to 1980.


Linear Model: Fitted Line Plot GDP = 1,389.3978 + 474.6244 Coded Year

(c)

$\hat{Y}_{2022} = 1,389.3978 + 474.6244(42) = 21,323.6218$ billion dollars.
$\hat{Y}_{2023} = 1,389.3978 + 474.6244(43) = 21,798.2462$ billion dollars.

(d)

Based on the regression equation, one would conclude that there is an upward linear trend in GDP.

16.14

(a)


(b)

Receipts vs Coded Year
Simple Linear Regression Analysis

Regression Statistics
Multiple R            0.9824
R Square              0.9651
Adjusted R Square     0.9642
Standard Error      177.2781
Observations              44

ANOVA
             df             SS             MS          F
Regression    1  36448964.8010  36448964.8010  1159.7783
Residual     42   1319956.1581     31427.5276
Total        43  37768920.9591

              Coefficients  Standard Error   t Stat  P-value
Intercept         238.9876         52.5530   4.5476   0.0000
Coded Year         71.6748          2.1046  34.0555   0.0000

$\hat{Y} = 238.9876 + 71.6748X$, where X = years relative to 1978. t = 34.0555, p-value = 0.0000 < 0.05; coded year is significant. $r^2$ = 0.9651; 96.51% of the variation in federal receipts is explained by year.


(c)

$\hat{Y}_{2022} = 238.9876 + 71.6748(44) = 3,392.678$ billion
$\hat{Y}_{2023} = 238.9876 + 71.6748(45) = 3,464.353$ billion

(d)

There is a strong upward trend in federal receipts from 1978 through 2021, which appears to be linear.

16.15

(a)

(b)

Total Sales (thousands) vs Coded Year
Simple Linear Regression Analysis

Regression Statistics
Multiple R            0.3960
R Square              0.1568
Adjusted R Square     0.1267
Standard Error      239.2122
Observations              30

ANOVA
             df            SS           MS       F
Regression    1   298008.6318  298008.6318  5.2079
Residual     28  1602229.2349   57222.4727
Total        29  1900237.8667

              Coefficients  Standard Error   t Stat  P-value
Intercept         868.7011         85.2085  10.1950   0.0000
Coded Year        -11.5150          5.0458  -2.2821   0.0303

Linear Model: Predicted House Sales = $\hat{Y} = 868.7011 - 11.515X$, where X = years relative to 1992. t = -2.2821, p-value = 0.0303 < 0.05; coded year is significant. $r^2$ = 0.1568; 15.68% of the variation in house sales is explained by year.


Linear Model: Fitted Line Plot Total Sales (thousands) = 868.7011 – 11.515 Coded Year

(c)

Total Sales (thousands) vs Coded Year, Coded Year Sq
Regression Analysis

Regression Statistics
Multiple R            0.4392
R Square              0.1929
Adjusted R Square     0.1331
Standard Error      238.3291
Observations              30

ANOVA
             df            SS           MS       F
Regression    2   366617.2501  183308.6251  3.2272
Residual     27  1533620.6165   56800.7636
Total        29  1900237.8667


               Coefficients  Standard Error   t Stat  P-value
Intercept          771.9544        122.2948   6.3122   0.0000
Coded Year           9.2164         19.5217   0.4721   0.6406
Coded Year Sq       -0.7149          0.6505  -1.0990   0.2815

Quadratic Model: Predicted House Sales = $\hat{Y} = 771.9544 + 9.2164X - 0.7149X^2$, where X = years relative to 1992. For the full model, F = 3.2272, which is below the 0.05 critical value of about 3.35 for 2 and 27 degrees of freedom, so the p-value is greater than 0.05 and the overall model is not significant at the 0.05 level. For Coded Year^2, t = -1.0990, p-value = 0.2815 > 0.05; Coded Year^2 is not significant. $r^2$ = 0.1929; 19.29% of the variation in house sales is explained by year.

Quadratic Model: Fitted Line Plot Total Sales (thousands) = 771.9544 + 9.2164 Coded Year - 0.7149 Coded Year^2

(d)

log(Total Sales(thousands)) vs Coded Year
Simple Linear Regression Analysis

Regression Statistics
Multiple R           0.4133
R Square             0.1708
Adjusted R Square    0.1412
Standard Error       0.1531
Observations             30

ANOVA
             df      SS      MS       F
Regression    1  0.1352  0.1352  5.7694
Residual     28  0.6561  0.0234
Total        29  0.7913

              Coefficients  Standard Error   t Stat  P-value
Intercept           2.9295          0.0545  53.7250   0.0000
Coded Year         -0.0078          0.0032  -2.4020   0.0232

Exponential model: log(predicted house sales) = $\log_{10}\hat{Y} = 2.9295 - 0.0078X$, where X = years relative to 1992. t = -2.4020, p-value = 0.0232 < 0.05; Coded Year is significant. $r^2$ = 0.1708; 17.08% of the variation in the log of house sales is explained by year.


Fitted Line Plot of log(House Sales(thousands)) vs Coded Year

(e)

Year  Total Sales  First Difference  Second Difference  Percentage Difference
1992          610              #N/A               #N/A                   #N/A
1993          666                56               #N/A                  9.18%
1994          670                 4                -52                  0.60%
1995          667                -3                 -7                 -0.45%
1996          757                90                 93                 13.49%
1997          804                47                -43                  6.21%
1998          886                82                 35                 10.20%
1999          880                -6                -88                 -0.68%
2000          877                -3                  3                 -0.34%
2001          908                31                 34                  3.53%
2002          973                65                 34                  7.16%
2003        1,086               113                 48                 11.61%
2004        1,203               117                  4                 10.77%
2005        1,283                80                -37                  6.65%
2006        1,051              -232               -312                -18.08%
2007          776              -275                -43                -26.17%
2008          485              -291                -16                -37.50%
2009          375              -110                181                -22.68%
2010          323               -52                 58                -13.87%
2011          306               -17                 35                 -5.26%
2012          368                62                 79                 20.26%
2013          429                61                 -1                 16.58%
2014          437                 8                -53                  1.86%
2015          501                64                 56                 14.65%
2016          561                60                 -4                 11.98%
2017          613                52                 -8                  9.27%
2018          617                 4                -48                  0.65%
2019          600               -17                -21                 -2.76%
2020          650                50                 67                  8.33%
2021          690                40                -10                  6.15%

A review of the first, second, and percentage differences reveals that no particular model is more appropriate than the others. Based on the principle of parsimony, one might choose the linear model. However, none of the models fits the data well because of an irregular component that occurred during the time series. Other models covered in Chapter 16 may be more appropriate.

(f)

$\hat{Y} = 868.7011 - 11.515(30) = 523.2506$ (thousands)

16.16

(a)

(b)

Linear: Solar Power Generated (gigawatts) vs Coded Year
Simple Linear Regression Analysis

Regression Statistics
Multiple R             0.9700
R Square               0.9409
Adjusted R Square      0.9350
Standard Error      9641.9264
Observations               12

ANOVA
             df                SS                MS         F
Regression    1  14791878580.3916  14791878580.3916  159.1094
Residual     10    929667449.2751     92966744.9275
Total        11  15721546029.6667

              Coefficients  Standard Error   t Stat  P-value
Intercept      -14810.0897       5235.7684  -2.8286   0.0179
Coded Year      10170.5315        806.2984  12.6139   0.0000

Predicted solar power generated = $\hat{Y} = -14,810.0897 + 10,170.5315X$, where X = years relative to 2010. t = 12.6139, p-value = 0.0000 < 0.05; coded year is significant. $r^2$ = 0.9409; 94.09% of the variation in solar power generated is explained by year.

Fitted Line Plot of Solar Power Generated (gigawatts) vs Coded Year

(c)

Quadratic Regression Analysis: Solar Power Generated vs Coded Year, Coded Year Sq

Regression Statistics
Multiple R             0.9968
R Square               0.9936
Adjusted R Square      0.9922
Standard Error      3340.8379
Observations               12

ANOVA
             df                SS               MS         F
Regression    2  15621095247.8205  7810547623.9103  699.7947
Residual      9    100450781.8462    11161197.9829
Total        11  15721546029.6667

               Coefficients  Standard Error   t Stat  P-value
Intercept         -359.3846       2470.1951  -0.1455   0.8875
Coded Year        1500.1084       1043.9910   1.4369   0.1846
Coded Year Sq      788.2203         91.4469   8.6194   0.0000

Predicted solar power generated = $\hat{Y} = -359.3846 + 1,500.1084X + 788.2203X^2$, where X = years relative to 2010. For the full model: F = 699.7947, p-value = 0.0002; at least one X variable is significant. For Coded Year^2: t = 8.6194, p-value = 0.0000 < 0.05; Coded Year^2 is significant. $r^2$ = 0.9936; 99.36% of the variation in solar power generated is explained by year.


(d)

Exponential: log(Solar Power(gigawatts)) vs Coded Year
Simple Linear Regression Analysis

Regression Statistics
Multiple R           0.9628
R Square             0.9270
Adjusted R Square    0.9197
Standard Error       0.1904
Observations             12

ANOVA
             df      SS      MS         F
Regression    1  4.6026  4.6026  127.0262
Residual     10  0.3623  0.0362
Total        11  4.9649

              Coefficients  Standard Error   t Stat  P-value
Intercept           3.3142          0.1034  32.0632   0.0000
Coded Year          0.1794          0.0159  11.2706   0.0000

Log(predicted solar power generated) = $\log_{10}\hat{Y} = 3.3142 + 0.1794X$, where X = years relative to 2010. t = 11.2706, p-value = 0.0000 < 0.05; Coded Year is significant. $r^2$ = 0.9270; 92.70% of the variation in the log of solar power generated is explained by year.
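One common way to read the slope of a base-10 exponential trend model, not stated in this solution, is as an estimated annual compound growth rate: back-transforming the slope gives 10^b1, and subtracting 1 gives the rate. The sketch below applies that reading to the printed slope of 0.1794; the variable names are illustrative.

```python
b1 = 0.1794  # slope of log10(Y) on coded year, from the output above

# Back-transform the slope: each additional year multiplies the
# predicted value by 10**b1.
growth_factor = 10 ** b1
growth_rate = growth_factor - 1

print(f"Estimated annual growth factor: {growth_factor:.4f}")
print(f"Estimated annual compound growth rate: {growth_rate:.1%}")
```

Under this reading, the fitted model implies solar power generation growing by roughly half again each year over the sample period, which helps explain why the exponential forecasts run far above the linear ones.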


Fitted Line Plot of log(Solar Power Generated (gigawatts)) vs Coded Year

(e)

Linear:
$\hat{Y}_{2022} = -14,810.0897 + 10,170.5315(12) = 107,236.29$ gigawatts
$\hat{Y}_{2023} = -14,810.0897 + 10,170.5315(13) = 117,406.82$ gigawatts

Quadratic:
$\hat{Y}_{2022} = -359.3846 + 1500.1084(12) + 788.2203(12)^2 = 131,145.64$ gigawatts
$\hat{Y}_{2023} = -359.3846 + 1500.1084(13) + 788.2203(13)^2 = 152,351.25$ gigawatts

Exponential:
$\log_{10}\hat{Y}_{2022} = 3.3142 + 0.1794(12) = 5.4670$, so $\hat{Y}_{2022} = 10^{5.4670} = 293,109.51$ gigawatts
$\log_{10}\hat{Y}_{2023} = 3.3142 + 0.1794(13) = 5.6464$, so $\hat{Y}_{2023} = 10^{5.6464} = 443,030.78$ gigawatts

16.17

(a)

Time Series Plot of Auto Production


(b)

Linear: Units Produced (thousands) vs Coded Year
Simple Linear Regression Analysis

Regression Statistics
Multiple R            0.7781
R Square              0.6055
Adjusted R Square     0.5867
Standard Error      702.1920
Observations              23

ANOVA
             df             SS             MS        F
Regression    1  15892756.8289  15892756.8289  32.2320
Residual     21  10354544.5846    493073.5516
Total        22  26247301.4135

              Coefficients  Standard Error   t Stat  P-value
Intercept        5123.8344        283.5356  18.0712   0.0000
Coded Year       -125.3168         22.0732  -5.6773   0.0000

Predicted Production = $\hat{Y} = 5,123.8344 - 125.3168X$, where X = years relative to 1999. t = -5.6773, p-value = 0.0000 < 0.05; coded year is significant. $r^2$ = 0.6055; 60.55% of the variation in units produced is explained by year.
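The t statistic quoted here is the slope divided by its standard error, and in a simple linear regression its square equals the ANOVA F statistic. A minimal check using the printed (rounded) values from the linear output for Problem 16.17:

```python
coefficient = -125.3168   # Coded Year slope from the linear output
standard_error = 22.0732  # its standard error

# t statistic for testing whether the slope differs from zero.
t_stat = coefficient / standard_error

# With one independent variable, F from the ANOVA table equals t squared.
f_stat = t_stat ** 2

print(f"t = {t_stat:.4f}")
print(f"F = t^2 = {f_stat:.4f}")
```

Both results should match the printed t Stat and F up to rounding, which is a quick way to confirm that a flattened table has been read in the right column order.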


(c)

Regression Analysis: Unit Production (thousands) vs Coded Year, Coded Year Sq

Regression Statistics
Multiple R            0.7784
R Square              0.6058
Adjusted R Square     0.5664
Standard Error      719.2189
Observations              23

ANOVA
             df             SS            MS        F
Regression    2  15901785.1296  7950892.5648  15.3707
Residual     20  10345516.2839   517275.8142
Total        22  26247301.4135

               Coefficients  Standard Error   t Stat  P-value
Intercept         5162.7093        413.4319  12.4874   0.0000
Coded Year        -136.4239         87.0604  -1.5670   0.1328
Coded Year Sq        0.5049          3.8215   0.1321   0.8962

Quadratic model: Predicted Unit Production = $\hat{Y} = 5162.7093 - 136.4239X + 0.5049X^2$, where X = years relative to 1999. For the full model: F = 15.3707, p-value = 0.0000; at least one X variable is significant. For Coded Year^2: t = 0.1321, p-value = 0.8962 > 0.05; Coded Year^2 is not significant. $r^2$ = 0.6058; 60.58% of the variation in unit production is explained by year.

Fitted Line Plot of Units Produced (thousands) vs Coded Year, Coded Year Sq


(d)

Exponential: log(Units Produced) vs Coded Year
Simple Linear Regression Analysis

Regression Statistics
Multiple R           0.7461
R Square             0.5567
Adjusted R Square    0.5355
Standard Error       0.0991
Observations             23

ANOVA
             df      SS      MS        F
Regression    1  0.2589  0.2589  26.3676
Residual     21  0.2062  0.0098
Total        22  0.4652

              Coefficients  Standard Error   t Stat  P-value
Intercept           3.7284          0.0400  93.1778   0.0000
Coded Year         -0.0160          0.0031  -5.1349   0.0000

Exponential model: log(predicted units produced) = $\log_{10}\hat{Y} = 3.7284 - 0.0160X$, where X = years relative to 1999. t = -5.1349, p-value = 0.0000 < 0.05; Coded Year is significant. $r^2$ = 0.5567; 55.67% of the variation in the log of units produced is explained by year.

Fitted Line Plot of log(Units Produced(thousands)) vs Coded Year

(e)

Year  Units Produced  First Difference  Second Difference  Percentage Difference
1999       5,577.749              #N/A               #N/A                   #N/A
2000       5,470.917              -107               #N/A                 -1.92%
2001       4,808.019              -663               -556                -12.12%
2002       4,957.377               149                812                  3.11%
2003       4,453.369              -504               -653                -10.17%
2004       4,165.925              -287                217                 -6.45%
2005       4,265.872               100                387                  2.40%
2006       4,311.696                46                -54                  1.07%
2007       3,867.268              -444               -490                -10.31%
2008       3,731.383              -136                309                 -3.51%
2009       2,196.446            -1,535             -1,399                -41.14%
2010       2,731.759               535              2,070                 24.37%
2011       2,977.711               246               -289                  9.00%
2012       4,109.013             1,131                885                 37.99%
2013       4,368.835               260               -871                  6.32%
2014       4,253.098              -116               -376                 -2.65%
2015       4,162.808               -90                 25                 -2.12%
2016       3,916.584              -246               -156                 -5.91%
2017       3,033.216              -883               -637                -22.55%
2018       2,785.164              -248                635                 -8.18%
2019       2,511.711              -273                -25                 -9.82%
2020       1,924.398              -587               -314                -23.38%
2021       1,562.717              -362                226                -18.79%

(f)

A review of the first, second, and percentage differences reveals that no particular model is more appropriate than the others. Based on the principle of parsimony, one might choose the linear model. However, none of the models fits the data well because of an irregular component that occurred during the time series. Other models covered in Chapter 16 may be more appropriate.

Forecasts for 2022 (X = 23):
Linear: $\hat{Y}_{2022} = 5123.8344 - 125.3168(23) = 2241.5475$ units (thousands)
Quadratic: $\hat{Y}_{2022} = 5162.7093 - 136.4239(23) + 0.5049(23)^2 = 2292.03$ units (thousands)
Exponential: $\log_{10}\hat{Y}_{2022} = 3.7284 - 0.0160(23) = 3.3605$, so $\hat{Y}_{2022} = 10^{3.3605} = 2293.46$ units (thousands)

16.18

(a)

Time Series Plot of MLB Salaries (in $millions)

(b)

Linear: Salary ($millions) vs Coded Year
Simple Linear Regression Analysis

Regression Statistics
Multiple R           0.9643
R Square             0.9298
Adjusted R Square    0.9259
Standard Error       0.2118
Observations             20

ANOVA
             df       SS       MS         F
Regression    1  10.6991  10.6991  238.4980
Residual     18   0.8075   0.0449
Total        19  11.5066

              Coefficients  Standard Error   t Stat  P-value
Intercept           2.2680          0.0913  24.8478   0.0000
Coded Year          0.1268          0.0082  15.4434   0.0000

Predicted MLB Salary = $\hat{Y} = 2.2680 + 0.1268X$, where X = years relative to 2003. t = 15.4434, p-value = 0.0000 < 0.05; coded year is significant. $r^2$ = 0.9298; 92.98% of the variation in MLB salaries is explained by year.


Fitted Line Plot of MLB Salaries ($millions) vs Coded Year

(c)

Regression Analysis: MLB Salary ($mil) vs Coded Year, Coded Year Sq

Regression Statistics
Multiple R           0.9658
R Square             0.9328
Adjusted R Square    0.9249
Standard Error       0.2133
Observations             20

ANOVA
             df       SS      MS         F
Regression    2  10.7333  5.3666  117.9703
Residual     17   0.7734  0.0455
Total        19  11.5066

               Coefficients  Standard Error   t Stat  P-value
Intercept            2.1885          0.1299  16.8511   0.0000
Coded Year           0.1533          0.0317   4.8396   0.0002
Coded Year Sq       -0.0014          0.0016  -0.8662   0.3984

Quadratic model: Predicted MLB Salary = $\hat{Y} = 2.1885 + 0.1533X - 0.0014X^2$, where X = years relative to 2003. For the full model: F = 117.9703, p-value = 0.0000; at least one X variable is significant. For Coded Year^2: t = -0.8662, p-value = 0.3984 > 0.05; Coded Year^2 is not significant. $r^2$ = 0.9328; 93.28% of the variation in MLB salaries is explained by year.

Fitted Line Plot of MLB Salary ($mil) vs Coded Year, Coded Year Sq

(d)

log(MLB Salary ($millions)) vs Coded Year
Simple Linear Regression Analysis

Regression Statistics
Multiple R           0.9681
R Square             0.9371
Adjusted R Square    0.9336
Standard Error       0.0257
Observations             20

ANOVA
             df      SS      MS         F
Regression    1  0.1778  0.1778  268.2773
Residual     18  0.0119  0.0007
Total        19  0.1897

              Coefficients  Standard Error   t Stat  P-value
Intercept           0.3746          0.0111  33.7678   0.0000
Coded Year          0.0164          0.0010  16.3792   0.0000

Exponential model: log(predicted salary) = $\log_{10}\hat{Y} = 0.3746 + 0.0164X$, where X = years relative to 2003. t = 16.3792, p-value = 0.0000 < 0.05; Coded Year is significant. $r^2$ = 0.9371; 93.71% of the variation in the log of MLB salaries is explained by year.


Fitted Line Plot of log(MLB Salary ($millions)) vs Coded Year

(e)

The difference columns below are computed from the two-decimal salary values shown.

Year  Salary  First Difference  Second Difference  Percentage Difference
2003    2.37              #N/A               #N/A                   #N/A
2004    2.31             -0.06               #N/A                 -2.53%
2005    2.46              0.15               0.21                  6.49%
2006    2.70              0.24               0.09                  9.76%
2007    2.82              0.12              -0.12                  4.44%
2008    2.93              0.11              -0.01                  3.90%
2009    3.00              0.07              -0.04                  2.39%
2010    3.01              0.01              -0.06                  0.33%
2011    3.10              0.09               0.08                  2.99%
2012    3.21              0.11               0.02                  3.55%
2013    3.39              0.18               0.07                  5.61%
2014    3.69              0.30               0.12                  8.85%
2015    3.84              0.15              -0.15                  4.07%
2016    4.38              0.54               0.39                 14.06%
2017    4.45              0.07              -0.47                  1.60%
2018    4.41             -0.04              -0.11                 -0.90%
2019    4.38             -0.03               0.01                 -0.68%
2020    4.43              0.05               0.08                  1.14%
2021    4.17             -0.26              -0.31                 -5.87%
2022    4.41              0.24               0.50                  5.76%

The first and second differences are relatively consistent across the series, whereas the percentage differences are not. Because the $r^2$ values of the linear and exponential models are similar and the linear model is simpler, the principle of parsimony favors the linear model.

(f)

Linear forecast: $\hat{Y}_{2023} = 2.2680 + 0.1268(20) = 4.8048$ million dollars


16.19

(a)

Time Series Plot of Silver (US$/ounce)

(b)

Silver Price (US$/ounce) vs Coded Year
Simple Linear Regression Analysis

Regression Statistics
Multiple R           0.7073
R Square             0.5003
Adjusted R Square    0.4765
Standard Error       6.0311
Observations             23

ANOVA
             df         SS        MS        F
Regression    1   764.6610  764.6610  21.0218
Residual     21   763.8697   36.3747
Total        22  1528.5307

              Coefficients  Standard Error  t Stat  P-value
Intercept           5.7811          2.4353  2.3739   0.0272
Coded Year          0.8692          0.1896  4.5849   0.0002

Predicted Silver Price = $\hat{Y} = 5.7811 + 0.8692X$, where X = years relative to 1999. t = 4.5849, p-value = 0.0002 < 0.05; coded year is significant. $r^2$ = 0.5003; 50.03% of the variation in silver price is explained by year.


Fitted Line Plot of Silver Price (US$/ounce) vs Coded Year

(c)

Regression Analysis: Silver Price (US$/ounce) vs Coded Year, Coded Year Sq

Regression Statistics
Multiple R           0.7657
R Square             0.5864
Adjusted R Square    0.5450
Standard Error       5.6225
Observations             23

ANOVA
             df         SS        MS        F
Regression    2   896.2734  448.1367  14.1758
Residual     20   632.2573   31.6129
Total        22  1528.5307

              Coefficients  Standard Error   t Stat  P-value
Intercept           1.0874          3.2320   0.3364   0.7400
Coded Year          2.2103          0.6806   3.2476   0.0040
Coded Year^2       -0.0610          0.0299  -2.0404   0.0547

Quadratic model: Predicted Silver Price = $\hat{Y} = 1.0874 + 2.2103X - 0.0610X^2$, where X = years relative to 1999. For the full model: F = 14.1758, p-value = 0.0000; at least one X variable is significant. For Coded Year^2: t = -2.0404, p-value = 0.0547 > 0.05; Coded Year^2 is not significant. $r^2$ = 0.5864; 58.64% of the variation in silver price is explained by year.

Fitted Line Plot of Silver Price (US$/ounce) vs Coded Year, Coded Year Sq: Price = 1.0874 + 2.2103X - 0.0610X^2

(d)

log(Silver Price(US$/ounce)) vs Coded Year
Simple Linear Regression Analysis

Regression Statistics
Multiple R           0.8048
R Square             0.6477
Adjusted R Square    0.6309
Standard Error       0.1667
Observations             23

ANOVA
             df      SS      MS        F
Regression    1  1.0733  1.0733  38.6085
Residual     21  0.5838  0.0278
Total        22  1.6570

              Coefficients  Standard Error   t Stat  P-value
Intercept           0.7540          0.0673  11.1991   0.0000
Coded Year          0.0326          0.0052   6.2136   0.0000

Exponential model: log(predicted silver price) = $\log_{10}\hat{Y} = 0.7540 + 0.0326X$, where X = years relative to 1999. t = 6.2136, p-value = 0.0000 < 0.05; Coded Year is significant. $r^2$ = 0.6477; 64.77% of the variation in the log of silver price is explained by year.


Fitted Line Plot of log(Silver Price(US$/ounce)) vs Coded Year

(e)

Year   Price  First Difference  Second Difference  Percentage Difference
1999   5.330              #N/A               #N/A                   #N/A
2000   4.570                -1               #N/A                -14.26%
2001   4.520                 0                  1                 -1.09%
2002   4.670                 0                  0                  3.32%
2003   5.965                 1                  1                 27.73%
2004   6.815                 1                  0                 14.25%
2005   8.830                 2                  1                 29.57%
2006  12.900                 4                  2                 46.09%
2007  14.760                 2                 -2                 14.42%
2008  10.790                -4                 -6                -26.90%
2009  16.990                 6                 10                 57.46%
2010  30.630                14                  7                 80.28%
2011  28.180                -2                -16                 -8.00%
2012  29.950                 2                  4                  6.28%
2013  19.500               -10                -12                -34.89%
2014  15.970                -4                  7                -18.10%
2015  13.820                -2                  1                -13.46%
2016  15.990                 2                  4                 15.70%
2017  16.865                 1                 -1                  5.47%
2018  15.490                -1                 -2                 -8.15%
2019  20.770                 5                  7                 34.09%
2020  26.490                 6                  0                 27.54%
2021  23.090                -3                 -9                -12.84%

None of the first, second, or percentage differences are constant across the years. Based on the principle of parsimony, one might choose the linear model. However, none of the models fits the data well because of an irregular component that occurred during the time series. Other models covered in Chapter 16 may be more appropriate.

(f)

Linear: $\hat{Y}_{2022} = 5.7811 + 0.8692(23) = 25.77$ (US$/ounce)


16.20

(a)

Time Series Plot of CPI-U (consumer price index for all urban consumers)

(b)

There has been an upward trend in the CPI-U in the United States from 1965 through 2021.

(c)

Linear: CPI-U (consumer price index for all urban consumers) vs Coded Year
Simple Linear Regression Analysis

Regression Statistics
Multiple R           0.9979
R Square             0.9957
Adjusted R Square    0.9956
Standard Error       4.9277
Observations             57

ANOVA
             df           SS           MS           F
Regression    1  309941.5232  309941.5232  12763.9405
Residual     55    1335.5424      24.2826
Total        56  311277.0656

              Coefficients  Standard Error    t Stat  P-value
Intercept          16.3914          1.2884   12.7223   0.0000
Coded Year          4.4821          0.0397  112.9776   0.0000

Linear: Predicted CPI-U = $\hat{Y} = 16.3914 + 4.4821X$, where X = years relative to 1965. t = 112.9776, p-value = 0.0000 < 0.05; coded year is significant. $r^2$ = 0.9957; 99.57% of the variation in CPI-U is explained by year.


Fitted Line Plot of CPI-U vs Coded Year

(d)

Regression Analysis: CPI-U vs Coded Year, Coded Year Sq

Regression Statistics
Multiple R           0.9979
R Square             0.9959
Adjusted R Square    0.9957
Standard Error       4.8759
Observations             57

ANOVA
             df           SS           MS          F
Regression    2  309993.2230  154996.6115  6519.3482
Residual     54    1283.8426      23.7749
Total        56  311277.0656

               Coefficients  Standard Error   t Stat  P-value
Intercept           18.4118          1.8715   9.8382   0.0000
Coded Year           4.2617          0.1545  27.5785   0.0000
Coded Year Sq        0.0039          0.0027   1.4746   0.1461

Quadratic model: Predicted CPI-U = $\hat{Y} = 18.4118 + 4.2617X + 0.0039X^2$, where X = years relative to 1965. For the full model: F = 6519.3482, p-value = 0.0000; at least one X variable is significant. For Coded Year^2: t = 1.4746, p-value = 0.1461 > 0.05; Coded Year^2 is not significant. $r^2$ = 0.9959; 99.59% of the variation in CPI-U is explained by year.

Fitted Line Plot of CPI-U vs Coded Year, Coded Year Sq

(e)

log(CPI-U) vs Coded Year
Simple Linear Regression Analysis

Regression Statistics
Multiple R           0.9664
R Square             0.9338
Adjusted R Square    0.9326
Standard Error       0.0751
Observations             57

ANOVA
             df      SS      MS         F
Regression    1  4.3820  4.3820  776.3896
Residual     55  0.3104  0.0056
Total        56  4.6924

              Coefficients  Standard Error   t Stat  P-value
Intercept           1.6004          0.0196  81.4767   0.0000
Coded Year          0.0169          0.0006  27.8638   0.0000

Exponential model: log(CPI-U) = $\log_{10}\hat{Y} = 1.6004 + 0.0169X$, where X = years relative to 1965. t = 27.8638, p-value = 0.0000 < 0.05; Coded Year is significant. $r^2$ = 0.9338; 93.38% of the variation in the log of CPI-U is explained by year.


Fitted Line of log(CPI-U) vs Coded Year

(f)

Because the quadratic term is not significant and the $r^2$ value of the exponential model (0.9338) is lower than that of the linear model (0.9957), the linear model is chosen. A review of the first, second, and percentage differences revealed similar levels of variation across the time series, so the principle of parsimony also favors the linear model.

(g)

Linear forecast:
$\hat{Y}_{2022} = 16.3914 + 4.4821(57) = 271.8732$
$\hat{Y}_{2023} = 16.3914 + 4.4821(58) = 276.3553$

16.21

(a)

Time Series I:

Year  Series I  First Difference  Second Difference  Percentage Difference
2010      10.0              #N/A               #N/A                   #N/A
2011      15.1               5.1               #N/A                 51.00%
2012      24.0               8.9                3.8                 58.94%
2013      36.7              12.7                3.8                 52.92%
2014      53.8              17.1                4.4                 46.59%
2015      74.8              21.0                3.9                 39.03%
2016     100.0              25.2                4.2                 33.69%
2017     129.2              29.2                4.0                 29.20%
2018     162.4              33.2                4.0                 25.70%
2019     199.0              36.6                3.4                 22.54%
2020     239.3              40.3                3.7                 20.25%
2021     283.5              44.2                3.9                 18.47%

For Time Series I, second differences are the most stable, with values staying near 4. The quadratic model appears to be the most appropriate model.
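The difference columns used for model selection can be generated directly from the series. The sketch below recomputes first, second, and percentage differences for Time Series I; the helper name `differences` is illustrative, and the stability check at the end is only an informal reading of the rule applied above (roughly constant second differences suggest a quadratic trend).

```python
series_i = [10.0, 15.1, 24.0, 36.7, 53.8, 74.8,
            100.0, 129.2, 162.4, 199.0, 239.3, 283.5]  # 2010 through 2021


def differences(values):
    """Return first, second, and percentage differences of a series."""
    first = [b - a for a, b in zip(values, values[1:])]
    second = [b - a for a, b in zip(first, first[1:])]
    percent = [100 * (b - a) / a for a, b in zip(values, values[1:])]
    return first, second, percent


first, second, percent = differences(series_i)

# Spread (max - min) of each column; a small spread means a stable column.
for name, column in (("first", first), ("second", second), ("percent", percent)):
    spread = max(column) - min(column)
    print(f"{name:<8} min={min(column):7.2f} max={max(column):7.2f} spread={spread:6.2f}")
```

For this series the second differences have by far the smallest spread, consistent with choosing the quadratic model.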


Time Series II:

Year  Series II  First Difference  Second Difference  Percentage Difference
2010       30.0              #N/A               #N/A                   #N/A
2011       33.1               3.1               #N/A                 10.33%
2012       36.4               3.3                0.2                  9.97%
2013       39.9               3.5                0.2                  9.62%
2014       43.9               4.0                0.5                 10.03%
2015       48.2               4.3                0.3                  9.79%
2016       53.2               5.0                0.7                 10.37%
2017       58.2               5.0                0.0                  9.40%
2018       64.5               6.3                1.3                 10.82%
2019       70.7               6.2               -0.1                  9.61%
2020       77.1               6.4                0.2                  9.05%
2021       83.9               6.8                0.4                  8.82%

For Time Series II, the second differences and percentage differences both appear to be stable. The quadratic model and the exponential model both appear to be appropriate.

Time Series III:

Year  Series III  First Difference  Second Difference  Percentage Difference
2010        60.0              #N/A               #N/A                   #N/A
2011        67.9               7.9               #N/A                 13.17%
2012        76.1               8.2                0.3                 12.08%
2013        84.0               7.9               -0.3                 10.38%
2014        92.2               8.2                0.3                  9.76%
2015       100.0               7.8               -0.4                  8.46%
2016       108.0               8.0                0.2                  8.00%
2017       115.8               7.8               -0.2                  7.22%
2018       124.1               8.3                0.5                  7.17%
2019       132.0               7.9               -0.4                  6.37%
2020       140.0               8.0                0.1                  6.06%
2021       147.8               7.8               -0.2                  5.57%

For Time Series III, the first differences are slightly more stable than the second differences. The linear model appears to be the most appropriate model.


(b)

Time Series I: Series I vs Coded Year, Coded Year Sq
Regression Analysis

Regression Statistics
Multiple R           1.0000
R Square             1.0000
Adjusted R Square    1.0000
Standard Error       0.3931
Observations             12

ANOVA
             df          SS          MS            F
Regression    2  94122.0589  47061.0295  304477.5537
Residual      9      1.3911      0.1546
Total        11  94123.4500

              Coefficients  Standard Error    t Stat  P-value
Intercept           9.7560          0.2907   33.5618   0.0000
Coded Year          3.1876          0.1229   25.9456   0.0000
Coded Year^2        1.9770          0.0108  183.7105   0.0000

$\hat{Y} = 9.756 + 3.1876X + 1.9770X^2$, where X = years relative to 2010.

Time Series II: Series II vs Coded Year, Coded Year Sq
Regression Analysis

Regression Statistics
Multiple R           0.9999
R Square             0.9999
Adjusted R Square    0.9999
Standard Error       0.2049
Observations             12

ANOVA
             df         SS         MS           F
Regression    2  3485.0912  1742.5456  41495.8162
Residual      9     0.3779     0.0420
Total        11  3485.4692

              Coefficients  Standard Error    t Stat  P-value
Intercept          30.1920          0.1515  199.2630   0.0000
Coded Year          2.5818          0.0640   40.3180   0.0000
Coded Year^2        0.2103          0.0056   37.4855   0.0000

$\hat{Y} = 30.192 + 2.5818X + 0.2103X^2$, where X = years relative to 2010.

16.21 cont.

(b)

Time Series III: Series III vs Coded Year Simple Linear Regression Analysis

Regression Statistics Multiple R

1.0000

R Square

1.0000

Adjusted R Square

1.0000

Standard Error

0.1168

Observations

12

ANOVA df

SS


MS

F


Solutions to End-of-Section and Chapter Review Problems cdxxxiii Regression

1

9130.4127 9130.4127 669277.5852

Residual

10

0.1364

Total

11

9130.5492

Coefficients

Standard Error

Intercept

60.0436

0.0634

946.6904

0.0000

Coded Year

7.9906

0.0098

818.0939

0.0000

0.0136

t Stat

P-value

$\hat{Y} = 60.0436 + 7.99056X$, where X = years relative to 2010.

(c)

Forecasts, where X = 12 for the year 2022 in all models:
Time Series I: $\hat{Y}_{2022} = 9.756 + 3.188(12) + 1.9770(12^2) = 332.691$
Time Series II: $\hat{Y}_{2022} = 30.192 + 2.5818(12) + 0.21026(12^2) = 91.4523$
Time Series III: $\hat{Y}_{2022} = 60.0436 + 7.99056(12) = 155.930$
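As a quick check on the arithmetic, the three trend forecasts can be evaluated in code. This is a minimal sketch: `poly_forecast` is a hypothetical helper, and the coefficients are the rounded values printed above, so the results can differ from the printed forecasts in the second decimal place.

```python
def poly_forecast(coefs, x):
    """Evaluate b0 + b1*x + b2*x**2 + ... for a coded year x."""
    return sum(b * x ** k for k, b in enumerate(coefs))


X_2022 = 12  # coded year for 2022 (years relative to 2010)

forecast_i = poly_forecast([9.756, 3.188, 1.9770], X_2022)       # quadratic
forecast_ii = poly_forecast([30.192, 2.5818, 0.21026], X_2022)   # quadratic
forecast_iii = poly_forecast([60.0436, 7.99056], X_2022)         # linear

print(round(forecast_i, 3), round(forecast_ii, 4), round(forecast_iii, 3))
```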

16.22

(a)

Time Series I: Data Y over Time XTime Series I: Data log (Y) over Time X

For Time Series I, the first differences are slightly more stable than second differences. The linear model appears to be the most appropriate model.

16.22 cont.

(a)

Time Series II: Data Y over Time XTime Series II: Data log (Y) over Time X

For Time Series II, the graph of log (Y) versus X appears to be more linear than the graph of Y versus X, so an exponential model appears to be more appropriate. (b)

Time Series I: $\hat{Y} = 100.0731 + 14.9776X$, where X = years relative to 2010

Simple Linear Regression Analysis

              Coefficients   Standard Error   t Stat      P-value
Intercept     100.0731       0.0539           1857.8969   0.0000
Coded Year    14.9776        0.0083           1805.6429   0.0000

Time Series II: $\hat{Y} = 10^{1.9982 + 0.0609X}$, where X = years relative to 2010

Simple Linear Regression Analysis

              Coefficients   Standard Error   t Stat      P-value
Intercept     1.9982         0.0010           2003.1699   0.0000
Coded Year    0.0609         0.0002           396.3726    0.0000

(c)

X = 12 for the year 2022 in all models. Forecasts for the year 2022:
Time Series I: $\hat{Y} = 100.0731 + 14.9776(12) = 279.8045$
Time Series II: $\hat{Y} = 10^{1.9982 + 0.0609(12)} = 535.6886$

16.23

(a) Four. Comparisons cannot be made for the first four observations.
(b) Five. One for each of the four variables and one for the intercept.
(c) The final four observations are needed to generate forecasts.
(d) $\hat{Y}_i = a_0 + a_1 Y_{i-1} + a_2 Y_{i-2} + a_3 Y_{i-3} + a_4 Y_{i-4}$
(e) $\hat{Y}_{n+j} = a_0 + a_1 \hat{Y}_{n+j-1} + a_2 \hat{Y}_{n+j-2} + a_3 \hat{Y}_{n+j-3} + a_4 \hat{Y}_{n+j-4}$
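The j-period-ahead forecasting equation is recursive: each new forecast is fed back in as a lagged value for the next one. The sketch below illustrates that recursion under stated assumptions. The function `ar_forecast` is hypothetical, and the numbers are the third-order coefficients and observations that appear in the 16.25 solution, not part of 16.23 itself.

```python
def ar_forecast(intercept, coefs, history, steps):
    """Forecast `steps` periods ahead with an autoregressive model.

    coefs[k] multiplies the value k+1 periods back; history is ordered
    oldest to newest and must hold at least len(coefs) values.
    """
    values = list(history)
    forecasts = []
    for _ in range(steps):
        # Most recent value first, then the one before it, and so on
        lags = values[-1:-len(coefs) - 1:-1]
        y_hat = intercept + sum(a * y for a, y in zip(coefs, lags))
        forecasts.append(y_hat)
        values.append(y_hat)  # the forecast becomes a lag for the next step
    return forecasts


# Values from the 16.25 solution: a0 = 4.50, a1 = 1.80, a2 = 0.80, a3 = 0.24,
# with Y15 = 25, Y16 = 31, Y17 = 36
forecasts = ar_forecast(4.50, [1.80, 0.80, 0.24], [25, 31, 36], steps=2)
print([round(f, 2) for f in forecasts])
```

The second forecast uses the first forecast in place of the unobserved value, which is why forecast error tends to compound as the horizon grows.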


16.24

$t_{STAT} = \dfrac{a_3}{S_{a_3}} = \dfrac{0.24}{0.10} = 2.4$ is greater than the critical bound of 2.2281. Reject H0. There is sufficient evidence that the third-order regression parameter is significantly different from zero. A third-order autoregressive model is appropriate.

16.25

$\hat{Y}_{18} = a_0 + a_1 Y_{17} + a_2 Y_{16} + a_3 Y_{15} = 4.50 + (1.80)(36) + (0.80)(31) + (0.24)(25) = 100.1$
$\hat{Y}_{19} = a_0 + a_1 \hat{Y}_{18} + a_2 Y_{17} + a_3 Y_{16} = 4.50 + (1.80)(100.1) + (0.80)(36) + (0.24)(31) = 220.92$

16.26

(a) $t_{STAT} = \dfrac{a_3}{S_{a_3}} = \dfrac{0.24}{0.15} = 1.6$ is less than the critical bound of 2.2281. Do not reject H0. There is not sufficient evidence that the third-order regression parameter is significantly different than zero. A third-order autoregressive model is not appropriate.

(b) Fit a second-order autoregressive model and test to see if it is appropriate.

16.27

(a)

Regression Analysis: Total Sales (thousands) vs Lag1, Lag2, Lag3 Rows unused: 3 Regression Analysis

Regression Statistics Multiple R

0.9691

R Square

0.9392

Adjusted R Square

0.9313

Standard Error

70.6486

Observations

27

ANOVA df

SS

MS

F

Regression

3

1773802.0158 591267.3386 118.4613

Residual

23

114798.2805

Total

26

1888600.2963

Coefficients

Standard Error


4991.2296

t Stat

P-value


cdxxxvi Chapter 16: Time-Series Forecasting Intercept

88.4888

44.6247

1.9830

0.0594

Lag1

1.7458

0.2044

8.5391

0.0000

Lag2

-1.0162

0.3571

-2.8454

0.0092

Lag3

0.1469

0.2043

0.7187

0.4796

For the third order term, tSTAT = 0.7187 with a p-value of 0.4796. The third term can be dropped because it is not significant at the 0.05 significance level.
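The test on the highest-order term reduces to dividing the coefficient by its standard error and comparing the result with a critical value. The following sketch assumes a two-tail critical value of 2.0687 (the 0.05 level with 23 degrees of freedom, taken from a t table); the function name is hypothetical, and the inputs are the rounded Lag3 values from the output above, so the statistic matches the printed 0.7187 only to about three decimals.

```python
def highest_order_t_test(coef, std_err, critical_value):
    """Return the t statistic and whether the term is significant (two-tail)."""
    t_stat = coef / std_err
    return t_stat, abs(t_stat) > critical_value


# Lag3 coefficient and standard error from the third-order model output;
# the critical value is an assumption looked up for 23 degrees of freedom
t_stat, keep_term = highest_order_t_test(0.1469, 0.2043, critical_value=2.0687)

print(round(t_stat, 4), keep_term)
```

When `keep_term` is false, the usual next step is to drop the highest-order term, refit the lower-order model, and repeat the test.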

16.27 cont.

(b)

Regression Analysis: Total Sales (thousands) vs Lag1, Lag2 Rows unused: 2 Regression Analysis

Regression Statistics Multiple R

0.9679

R Square

0.9368

Adjusted R Square

0.9317

Standard Error

69.1224

Observations

28

ANOVA df

SS

MS

F

Regression

2

1770518.2193 885259.1096 185.2821

Residual

25

119447.4950

Total

27

1889965.7143

Coefficients

Standard Error

100.5300

38.4250

2.6163

0.0149

Lag1

1.6253

0.1272

12.7771

0.0000

Lag2

-0.7682

0.1270

-6.0483

0.0000

Intercept

4777.8998

t Stat

P-value

For the second order term, tSTAT = –6.0483 with a p-value of 0.0000. The second order term is significant at the 0.05 significance level. The second order term should be retained.

(c) A first-order autoregression is not necessary.

(d) $\hat{Y}_{2022} = 100.5300 + 1.6253 Y_{2021} - 0.7682 Y_{2020} = 100.5300 + 1.6253(690) - 0.7682(650) = 722.6907$ thousand houses sold


Solutions to End-of-Section and Chapter Review Problems cdxxxix 16.28

(a)

Regression Analysis: Bonus ($ thousands) vs Lag1, Lag2, Lag3 Rows unused: 3 Regression Analysis

Regression Statistics Multiple R

0.6335

R Square

0.4014

Adjusted R Square

0.2816

Standard Error

33.9143

Observations

19

ANOVA df

SS

MS

Regression

3

11566.8807 3855.6269

Residual

15

17252.7361 1150.1824

Total

18

28819.6168

Coefficients

Standard Error

Intercept

50.4300

Lag1

F 3.3522

t Stat

P-value

37.7256

1.3368

0.2012

0.6323

0.2727

2.3182

0.0350

Lag2

-0.0376

0.3163

-0.1190

0.9069

Lag3

0.1436

0.2740

0.5239

0.6080

For the third order term, tSTAT = 0.5239 with a p-value of 0.6080. The third term can be dropped because it is not significant at the 0.05 significance level.

16.28 cont.

(b)

Regression Analysis: Bonus ($ thousands) vs Lag1, Lag2 Rows unused: 2 Regression Analysis

Regression Statistics Multiple R

0.6951

R Square

0.4832

Adjusted R Square

0.4224

Standard Error

33.8617

Observations

20

ANOVA df

SS

MS

Regression

2

18224.0096 9112.0048

Residual

17

19492.4959 1146.6174

Total

19

37716.5055

Coefficients

Standard Error

Intercept

41.6635

Lag1 Lag2

F 7.9469

t Stat

P-value

31.2113

1.3349

0.1995

0.7421

0.2548

2.9125

0.0097

0.0329

0.2660

0.1236

0.9031

For the second order term, tSTAT = 0.1236 with a p-value of 0.9031. The second order term can be dropped because it is not significant at the 0.05 significance level.

Copyright ©2024 Pearson Education, Inc.


Solutions to End-of-Section and Chapter Review Problems cdxli 16.28 cont.

(c)

Regression Analysis: Bonus ($ thousands) vs Lag1 Rows unused: 1 Simple Linear Regression Analysis

Regression Statistics Multiple R

0.7137

R Square

0.5094

Adjusted R Square

0.4836

Standard Error

33.5612

Observations

21

ANOVA df

SS

MS

F

Regression

1

22219.7257 22219.7257 19.7271

Residual

19

21400.7800

Total

20

43620.5057

Coefficients

Standard Error

Intercept

32.9804

27.1474

1.2149

0.2393

Lag1

0.8199

0.1846

4.4415

0.0003

1126.3568

t Stat

P-value

For the first order term, tSTAT = 4.4415 with a p-value of 0.0003. The first order term should be retained because it is significant at the 0.05 significance level.

(d) $\hat{Y}_{2022} = 32.9804 + 0.8199 Y_{2021} = 32.9804 + 0.8199(257.5) = 244.1040$ ($ thousands)

Copyright ©2024 Pearson Education, Inc.


cdxlii Chapter 16: Time-Series Forecasting 16.29

(a)

Regression Analysis: Units Produced (thousands) vs Lag1, Lag2, Lag3 Rows unused: 3 Regression Analysis

Regression Statistics Multiple R

0.8477

R Square

0.7187

Adjusted R Square

0.6659

Standard Error

558.1040

Observations

20

ANOVA df

SS

MS

F

Regression

3 12731566.1591 4243855.3864 13.6248

Residual

16

Total

19 17715248.1221

4983681.9630

311480.1227

Coefficients

Standard Error

555.6187

629.1767

0.8831

0.3903

Lag1

0.9722

0.2446

3.9753

0.0011

Lag2

0.1195

0.3388

0.3526

0.7290

Lag3

-0.2685

0.2470

-1.0871

0.2931

Intercept

t Stat

P-value

For the third order term, tSTAT = –1.0871 with a p-value of 0.2931. The third term can be dropped because it is not significant at the 0.05 significance level.

Copyright ©2024 Pearson Education, Inc.


Solutions to End-of-Section and Chapter Review Problems cdxliii 16.29 cont.

(b)

Regression Analysis: Units Produced (thousands) vs Lag1, Lag2 Rows unused: 2 Regression Analysis

Regression Statistics Multiple R

0.8483

R Square

0.7195

Adjusted R Square

0.6884

Standard Error

548.5111

Observations

21

ANOVA df

SS

MS

F

Regression

2 13893654.1967 6946827.0984 23.0896

Residual

18

Total

20 19309213.4279

5415559.2312

300864.4017

Coefficients

Standard Error

419.7450

539.4739

0.7781

0.4466

Lag1

0.9925

0.2355

4.2146

0.0005

Lag2

-0.1467

0.2403

-0.6107

0.5491

Intercept

t Stat

P-value

For the second order term, tSTAT = –0.6107 with a p-value of 0.5491. The second term can be dropped because it is not significant at the 0.05 significance level.

Copyright ©2024 Pearson Education, Inc.


cdxliv Chapter 16: Time-Series Forecasting 16.29 cont.

(c)

Regression Analysis: Units Produced (thousands) vs Lag1 Rows unused: 1 Simple Linear Regression Analysis

Regression Statistics Multiple R

0.8680

R Square

0.7534

Adjusted R Square

0.7411

Standard Error

529.4649

Observations

22

ANOVA df

SS

MS

F

Regression

1 17130328.9577 17130328.9577 61.1071

Residual

20

Total

21 22736990.7243

Intercept Units Produced

5606661.7666

280333.0883

Coefficients

Standard Error

t Stat

P-value

211.5935

455.6051

0.4644

0.6474

0.8975

0.1148

7.8171

0.0000

For the first order term, tSTAT = 7.8171 with a p-value of 0.0000. The first order term should be retained because it is significant at the 0.05 significance level.

(d) The most appropriate forecasting model is the first-order autoregressive model:
$\hat{Y}_{2022} = 211.5935 + 0.8975 Y_{2021} = 211.5935 + 0.8975(1562.717) = 1614.1208$ units produced (thousands)

Copyright ©2024 Pearson Education, Inc.


Solutions to End-of-Section and Chapter Review Problems cdxlv 16.30

(a)

Regression Analysis: MLB Salary ($ millions) vs Lag1, Lag2, Lag3 Rows unused: 3 Regression Analysis

Regression Statistics Multiple R

0.9722

R Square

0.9451

Adjusted R Square

0.9324

Standard Error

0.1754

Observations

17

ANOVA df

SS

MS

F

Regression

3

6.8791

2.2930 74.5682

Residual

13

0.3998

0.0308

Total

16

7.2788

Coefficients Standard Error

t Stat

P-value

Intercept

0.3679

0.2333

1.5769

0.1388

Lag1

0.9285

0.2936

3.1628

0.0075

Lag2

0.2200

0.4707

0.4675

0.6479

Lag3

-0.2278

0.3194 -0.7134

0.4882

For the third order term, tSTAT = –0.7134 with a p-value of 0.4882. The third term can be dropped because it is not significant at the 0.05 significance level.

Copyright ©2024 Pearson Education, Inc.


cdxlvi Chapter 16: Time-Series Forecasting 16.30 cont.

(b)

Regression Analysis: MLB Salary ($ millions) vs Lag1, Lag2 Rows unused: 2 Regression Analysis

Regression Statistics Multiple R

0.9756

R Square

0.9518

Adjusted R Square

0.9454

Standard Error

0.1667

Observations

18

ANOVA df

SS

MS

F

Regression

2

8.2355

4.1178 148.2401

Residual

15

0.4167

0.0278

Total

17

8.6522

Coefficients

Standard Error

Intercept

0.3331

Lag1 Lag2

t Stat

P-value

0.1955

1.7042

0.1090

1.0074

0.2508

4.0169

0.0011

-0.0716

0.2434

-0.2943

0.7725

For the second order term, tSTAT = –0.2943 with a p-value of 0.7725. The second term can be dropped because it is not significant at the 0.05 significance level.

Copyright ©2024 Pearson Education, Inc.


Solutions to End-of-Section and Chapter Review Problems cdxlvii 16.30 cont.

(c)

Regression Analysis: MLB Salary ($ millions) vs Lag1 Rows unused: 1 Simple Linear Regression Analysis

Regression Statistics Multiple R

0.9767

R Square

0.9539

Adjusted R Square

0.9512

Standard Error

0.1665

Observations

19

ANOVA
             df    SS         MS       F
Regression   1     9.7549     9.7549   351.9942
Residual     17    0.4711     0.0277
Total        18    10.2260

             Coefficients   Standard Error   t Stat    P-value
Intercept    0.2440         0.1793           1.3605    0.1914
Lag1         0.9601         0.0512           18.7615   0.0000

For the first order term, tSTAT = 18.7615 with a p-value of 0.0000. The first order term should be retained because it is significant at the 0.05 significance level.

(d) $\hat{Y}_{2023} = 0.2440 + 0.9601 Y_{2022} = 0.2440 + 0.9601(4.41) = 4.4780$ ($ millions)

Copyright ©2024 Pearson Education, Inc.


cdxlviii Chapter 16: Time-Series Forecasting 16.31

(a)

Regression Analysis: Solar Power Generated (gigawatts) vs Lag1, Lag2, Lag3 Rows unused: 3 Regression Analysis

Regression Statistics Multiple R

0.9949

R Square

0.9899

Adjusted R Square

0.9838

Standard Error

4436.3453

Observations

9

ANOVA df

SS

MS

F

Regression

3 9634507851.3840 3211502617.1280 163.1765

Residual

5

Total

8 9732913648.0000

98405796.6160

19681159.3232

Coefficients

Standard Error

8785.1072

3652.7954

2.4050

0.0612

Lag1

1.1944

0.3776

3.1635

0.0250

Lag2

-0.7640

0.5581

-1.3689

0.2293

Lag3

0.8203

0.4051

2.0249

0.0987

Intercept

t Stat

P-value

For the third order term, tSTAT = 2.0249 with a p-value of 0.0987. The third term can be dropped because it is not significant at the 0.05 significance level.

Copyright ©2024 Pearson Education, Inc.


Solutions to End-of-Section and Chapter Review Problems cdxlix 16.31 cont.

(b)

Regression Analysis: Solar Power Generated (gigawatts) vs Lag1, Lag2 Rows unused: 2 Regression Analysis

Regression Statistics Multiple R

0.9921

R Square

0.9844

Adjusted R Square

0.9799

Standard Error

5169.1234

Observations

10

ANOVA df

SS

MS

F

Regression

2 11768299798.4597 5884149899.2299 220.2165

Residual

7

Total

9 11955338656.4000

Coefficients Intercept

187038857.9403

Standard Error

26719836.8486

t Stat

P-value

4453.9291

3147.9158

1.4149

0.2000

Lag1

1.2742

0.4055

3.1425

0.0163

Lag2

-0.1216

0.4658

-0.2611

0.8015

For the second order term, tSTAT = –0.2611 with a p-value of 0.8015. The second order term is not significant at the 0.05 significance level. The second order term can be dropped.

Copyright ©2024 Pearson Education, Inc.


cdl Chapter 16: Time-Series Forecasting 16.31 cont.

(c)

Regression Analysis: Solar Power Generated (gigawatts) vs Lag1 Rows unused: 1 Simple Linear Regression Analysis

Regression Statistics Multiple R

0.9926

R Square

0.9853

Adjusted R Square

0.9837

Standard Error

4771.7587

Observations

11

ANOVA df

SS

MS

F

Regression

1 13778502076.9679 13778502076.9679 605.1249

Residual

9

Total

10 13983429210.7273

Coefficients Intercept

204927133.7594

Standard Error

22769681.5288

t Stat

P-value

3959.9648

2195.5439

1.8036

0.1048

1.1845

0.0482

24.5993

0.0000

Lag1

For the first order term, tSTAT = 24.5993 with a p-value of 0.0000. The first order term should be retained because it is significant at the 0.05 significance level. (d)

$\hat{Y}_{2022} = 3,959.9648 + 1.1845 Y_{2021} = 3,959.9648 + 1.1845(114,678) = 139,798.3124$ gigawatts

16.32

(a)

$S_{YX} = \sqrt{\dfrac{\sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2}{n - p - 1}} = \sqrt{\dfrac{45}{12 - 1 - 1}} = 2.121$. The standard error of the estimate is 2.121.

(b)

$MAD = \dfrac{\sum_{i=1}^{n} |Y_i - \hat{Y}_i|}{n} = \dfrac{18}{12} = 1.5$. The mean absolute deviation is 1.5.
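Both error measures are simple functions of the residuals. The sketch below is illustrative only: the function names are hypothetical, the first check reuses the totals quoted in this solution (a sum of squared residuals of 45, a sum of absolute residuals of 18, n = 12, p = 1), and the four-point data set is made up purely to show the list-based versions.

```python
import math


def standard_error_of_estimate(actual, fitted, p):
    """S_YX: square root of SSE divided by (n - p - 1), p = number of predictors."""
    n = len(actual)
    sse = sum((y - f) ** 2 for y, f in zip(actual, fitted))
    return math.sqrt(sse / (n - p - 1))


def mean_absolute_deviation(actual, fitted):
    """MAD: mean of the absolute residuals."""
    return sum(abs(y - f) for y, f in zip(actual, fitted)) / len(actual)


# Check against the totals in the solution
s_yx_from_totals = math.sqrt(45 / (12 - 1 - 1))
mad_from_totals = 18 / 12

# Hypothetical toy data, not taken from the textbook
actual = [10.0, 12.0, 15.0, 19.0]
fitted = [11.0, 12.0, 14.0, 20.0]
toy_s_yx = standard_error_of_estimate(actual, fitted, p=1)
toy_mad = mean_absolute_deviation(actual, fitted)

print(round(s_yx_from_totals, 3), mad_from_totals, round(toy_s_yx, 4), toy_mad)
```

Because S_YX squares the residuals, it penalizes large errors more heavily than MAD does, which is why the two measures can rank models differently.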

16.33

(a)

$S_{YX} = \sqrt{\dfrac{\sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2}{n - p - 1}} = \sqrt{\dfrac{335.24}{12 - 1 - 1}} = 5.790$. The standard error of the estimate is 5.790.

(b)

$MAD = \dfrac{\sum_{i=1}^{n} |Y_i - \hat{Y}_i|}{n} = \dfrac{39.2}{12} = 3.267$. The mean absolute deviation is 3.267.

Copyright ©2024 Pearson Education, Inc.


cdlii Chapter 16: Time-Series Forecasting 16.34

(a)

Linear Displays large curvilinear pattern. Do not consider this model.

Quadratic Does not show a pattern. Consider this model.

Copyright ©2024 Pearson Education, Inc.


Solutions to End-of-Section and Chapter Review Problems cdliii 16.34 cont.

(a)

Exponential Displays large curvilinear pattern. Do not consider this model.

First-order Autoregressive Does not show a pattern. Consider this model.

Copyright ©2024 Pearson Education, Inc.


cdliv Chapter 16: Time-Series Forecasting 16.34 cont.

(b–c)

Solar Power        r2       SYX          MAD
Linear             0.9409   9,641.926    7,074.251
Quadratic          0.9936   3,340.838    2,251.934
Exponential        0.927    29,413.36    16,012.19
AR-First Order     0.9853   4,771.759    2,975.726

(d)

Because the quadratic trend model had the highest r2 of the regression models, the first-order autoregressive model had a significant t value, and neither showed a pattern in the residuals, those two models should be considered. Quadratic model: SYX = 3,340.838, MAD = 2,251.934. First-order autoregressive model: SYX = 4,771.759, MAD = 2,975.726. Because the quadratic model has lower SYX and MAD and a similar r2, that model should be selected.
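The selection logic can be written down as a filter followed by a comparison. The sketch below is one possible encoding, not the manual's procedure: the dictionary layout and the `residual_pattern` flag are assumptions, with the flags set from the residual-plot conclusions in (a) and the numbers copied from the table in (b–c).

```python
# Summary measures from (b-c); residual_pattern records the verdict from (a)
models = {
    "Linear":         {"r2": 0.9409, "s_yx": 9641.926, "mad": 7074.251, "residual_pattern": True},
    "Quadratic":      {"r2": 0.9936, "s_yx": 3340.838, "mad": 2251.934, "residual_pattern": False},
    "Exponential":    {"r2": 0.927,  "s_yx": 29413.36, "mad": 16012.19, "residual_pattern": True},
    "AR-First Order": {"r2": 0.9853, "s_yx": 4771.759, "mad": 2975.726, "residual_pattern": False},
}

# Step 1: keep only models whose residual plots show no pattern
candidates = {name: m for name, m in models.items() if not m["residual_pattern"]}

# Step 2: among those, prefer the smallest S_YX, using MAD to break ties
best = min(candidates, key=lambda name: (candidates[name]["s_yx"], candidates[name]["mad"]))

print(sorted(candidates), best)
```

Here S_YX and MAD agree on the ranking; when they disagree, the principle of parsimony is the usual tie-breaker.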

16.35

(a)

Linear Displays cyclical pattern. Do not consider this model.

Copyright ©2024 Pearson Education, Inc.


Solutions to End-of-Section and Chapter Review Problems cdlv 16.35 cont.

(a)

Quadratic Displays cyclical pattern. Do not consider this model.

Exponential Displays cyclical pattern. Do not consider this model.

Copyright ©2024 Pearson Education, Inc.


cdlvi Chapter 16: Time-Series Forecasting 16.35 cont.

(a)

Second-Order Autoregressive Does not show a pattern. Consider this model.

(b–c)

House Sales         r2       SYX        MAD
Linear              0.1568   239.2122   188.4059
Quadratic           0.1929   238.3291   184.3723
Exponential         0.1708   244.195    191.9446
AR-Second Order     0.9368   69.1224    51.3408

(d)

Based on the results from (a) through (c), the residuals associated with the linear, quadratic, and exponential models have cyclical patterns. The second-order autoregressive model has no clear pattern. It also has the smallest SYX and MAD values and the highest r2. Based on these results and the principle of parsimony, the second-order autoregressive model would be the best option.

Copyright ©2024 Pearson Education, Inc.


Solutions to End-of-Section and Chapter Review Problems cdlvii 16.36

(a)

Linear Displays slight cyclic pattern. Do not consider this model.

Quadratic Displays slight cyclic pattern. Do not consider this model.

Copyright ©2024 Pearson Education, Inc.


cdlviii Chapter 16: Time-Series Forecasting 16.36 cont.

(a)

Exponential Displays slight cyclic pattern. Do not consider this model.

First-Order Autoregressive Does not show a pattern. Consider this model.

Copyright ©2024 Pearson Education, Inc.


Solutions to End-of-Section and Chapter Review Problems cdlix 16.36 cont.

(b–c)

Bonus              r2       SYX       MAD
Linear             0.5838   30.8973   22.6094
Quadratic          0.5911   31.4203   21.6598
Exponential        0.58     30.6493   21.5919
AR-First Order     0.5094   33.5612   26.6907

(d)

The residual plots for the linear, quadratic, and exponential models reveal a slight cyclical pattern for the first part of the time series followed by no clear pattern for the remainder of the series. The first-order autoregressive model revealed no clear pattern throughout the time series. Each of the models had similar r2, SYX, and MAD values. On the basis of the residual plots and the principle of parsimony, the first-order autoregressive model might be the best choice for forecasting.

16.37

(a)

Linear Displays cyclic pattern. Do not consider this model.

Copyright ©2024 Pearson Education, Inc.


cdlx Chapter 16: Time-Series Forecasting 16.37 cont.

(a)

Quadratic Displays cyclic pattern. Do not consider this model.

Exponential Displays cyclic pattern. Do not consider this model.

Copyright ©2024 Pearson Education, Inc.


Solutions to End-of-Section and Chapter Review Problems cdlxi 16.37 cont.

(a)

First-Order Autoregressive Does not show a pattern. Consider this model.

(b–c)

Units Produced     r2       SYX        MAD
Linear             0.6055   702.192    521.5868
Quadratic          0.6058   719.2189   520.7087
Exponential        0.5567   708.3523   511.2103
AR-First Order     0.7534   529.4649   375.6391

(d)

The residual plots for the linear, quadratic, and exponential models reveal cyclical patterns across coded year. The first-order autoregressive model has no clear pattern but does have one outlier. The first-order autoregressive model has the smallest SYX and MAD values. On the basis of (a) through (c) and the principle of parsimony, the first-order autoregressive model would be the best model for forecasting.

Copyright ©2024 Pearson Education, Inc.


cdlxii Chapter 16: Time-Series Forecasting 16.38

(a)

Linear Displays cyclic pattern. Do not consider this model.

Quadratic Displays cyclic pattern. Do not consider this model.

Copyright ©2024 Pearson Education, Inc.


Solutions to End-of-Section and Chapter Review Problems cdlxiii 16.38 cont.

(a)

Exponential Displays cyclic pattern. Do not consider this model.

First-Order Autoregressive Does not show a pattern. Consider this model.

Copyright ©2024 Pearson Education, Inc.


cdlxiv Chapter 16: Time-Series Forecasting 16.38 cont.

(b–c)

MLB Salary         r2       SYX      MAD
Linear             0.9298   0.2118   0.1499
Quadratic          0.9328   0.2133   0.1498
Exponential        0.9371   0.2424   0.1637
AR-First Order     0.9539   0.1665   0.1069

(d)

The residual plots for the linear, quadratic, and exponential models reveal cyclical patterns across coded year. The first-order autoregressive model has no clear pattern. The SYX and MAD values are similar across each of the models, but lower for the first-order autoregressive model. On the basis of (a) through (c) and the principle of parsimony, the first-order autoregressive model might be the best model for forecasting.

16.39

(a)

(b) $S_{YX} = 787.0082$

(c) $MAD = 629.8621$

(d) On the basis of (a) through (c), the linear model does not appear to be an adequate option because it does not account for cyclical variations. Other models would be more appropriate. One would not be satisfied with the linear trend forecasts in 16.13.

16.40

(a) $b_0 = \log \hat{B}_0 = 2$, $\hat{B}_0 = 100$. This is the unadjusted forecast.

(b) $\log \hat{B}_1 = 0.01$, $\hat{B}_1 = 1.0233$. The estimated monthly compound growth rate is 2.33%.

(c) $\log \hat{B}_2 = 0.10$, $\hat{B}_2 = 1.2589$. The January values in the time series are estimated to have a mean of 25.89% higher than the December values.

16.41

To account for the seasonal component day of the week, six dummy variables would be needed.
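A categorical seasonal variable with k levels needs k - 1 dummy variables, because one level serves as the baseline. The sketch below shows one way to encode day of the week; the function name, the day labels, and the choice of Sunday as the reference level are all assumptions made for illustration.

```python
DAYS = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]


def dummy_code(day, levels=DAYS, reference="Sun"):
    """Return the 0/1 dummy variables for `day`, omitting the reference level."""
    return [1 if day == level else 0 for level in levels if level != reference]


wednesday = dummy_code("Wed")
sunday = dummy_code("Sun")  # the reference day is coded as all zeros

print(len(wednesday), wednesday, sunday)
```

Including a seventh dummy alongside an intercept would make the columns perfectly collinear, which is why the count stops at six.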

16.42

(a) $b_0 = \log \hat{B}_0 = 3$, then $\hat{B}_0 = 1,000$. This is the unadjusted forecast.

(b) $b_1 = \log \hat{B}_1 = 0.10$, then $\hat{B}_1 = 1.2589$. The estimated quarterly compound growth rate is $(\hat{B}_1 - 1)100\% = 25.89\%$.

(c) $b_3 = \log \hat{B}_3 = 0.20$, then $\hat{B}_3 = 1.5849$. $\hat{B}_3 = 1.5849$ is the seasonal multiplier relative to the fourth quarter. This multiplier indicates that second quarter values are 58.49% greater than the fourth quarter values.
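Converting a base-10 log coefficient into a multiplier and a percentage change is a one-line calculation in each direction. The helpers below are hypothetical names used only to illustrate the conversion with the coefficients from this problem.

```python
def multiplier(b):
    """Convert a base-10 log coefficient b into its multiplier 10**b."""
    return 10 ** b


def percent_change(b):
    """Percentage change implied by a base-10 log coefficient."""
    return (multiplier(b) - 1) * 100


base_level = multiplier(3)          # unadjusted forecast
growth_rate = percent_change(0.10)  # quarterly compound growth rate, in percent
q2_effect = percent_change(0.20)    # second quarter relative to fourth quarter

print(base_level, round(growth_rate, 2), round(q2_effect, 2))
```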

16.43

(a) Fitted value for Q4 of 2022: $\log \hat{Y}_{20} = 3.0 + 0.10(19) = 4.9$, so $\hat{Y}_{20} = 10^{4.9} = 79,432.82$

(b) Fitted value for Q1 of 2022: $\log \hat{Y}_{17} = 3.0 + 0.10(17) - 0.25 = 4.45$, so $\hat{Y}_{17} = 10^{4.45} = 28,183.83$

(c) Forecast for Q4 of 2023: $\log \hat{Y}_{24} = 3.0 + 0.10(23) = 5.30$, so $\hat{Y}_{24} = 10^{5.30} = 199,526.2315$

(d) Forecast for Q1 of 2023: $\log \hat{Y}_{21} = 3 + 0.10(21) - 0.25 = 4.85$, so $\hat{Y}_{21} = 10^{4.85} = 70,794.58$

16.44

(a)

The revenues for Target appear to be subject to seasonal variation given that revenues are consistently higher in the fourth quarter, which includes several substantial holidays.

(b)

The plot confirms the answer for (a) by clearly revealing a seasonal component to revenues.

Copyright ©2024 Pearson Education, Inc.


cdlxvi Chapter 16: Time-Series Forecasting 16.44 cont.

(c)

                 Coefficients   Standard Error   t Stat     P-value
Intercept        1.0974         0.0120           91.3610    0.0000
Coded Quarter    0.0044         0.0002           25.4733    0.0000
Q1               -0.1275        0.0128           -9.9209    0.0000
Q2               -0.1157        0.0128           -9.0051    0.0000
Q3               -0.1161        0.0128           -9.0392    0.0000

Predict log(Revenue) = 1.0974 + 0.0044 Coded Quarter – 0.1275Q1 – 0.1157Q2 – 0.1161Q3

(d)

$\log_{10} \hat{\beta}_1 = 0.0044$; $\hat{\beta}_1 = 10^{0.0044} = 1.0101$

The estimated quarterly compound growth rate is $(\hat{\beta}_1 - 1)100\% = 1.01\%$

(e)

Quarter   $b_i = \log_{10} \hat{\beta}_i$   $\hat{\beta}_i = 10^{b_i}$   $(\hat{\beta}_i - 1)100\%$
First     –0.1275                           0.7456                       –25.44%
Second    –0.1157                           0.7661                       –23.39%
Third     –0.1161                           0.7653                       –23.47%

The first, second, and third quarter multipliers are –25.44%, –23.39%, and –23.47% relative to fourth quarter values, respectively.

(f)

log(Revenue) = 1.0974 + 0.0044 Coded Quarter – 0.1275Q1 – 0.1157Q2 – 0.1161Q3
Predicted 2022 Q4 Revenue: log(Revenue) = 1.0974 + 0.0044(91) = 1.4961; $10^{1.4961} = 31.3422$ ($ millions)
Predicted 2023 Q1 Revenue: log(Revenue) = 1.0974 + 0.0044(92) – 0.1275 = 1.3730; $10^{1.3730} = 23.6065$ ($ millions)
Predicted 2023 Q2 Revenue: log(Revenue) = 1.0974 + 0.0044(93) – 0.1157 = 1.3892; $10^{1.3892} = 24.5014$ ($ millions)
Predicted 2023 Q3 Revenue: log(Revenue) = 1.0974 + 0.0044(94) – 0.1161 = 1.3931; $10^{1.3931} = 24.7243$ ($ millions)
Predicted 2023 Q4 Revenue: log(Revenue) = 1.0974 + 0.0044(95) = 1.5137; $10^{1.5137} = 32.6329$ ($ millions)
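The quarterly forecasts follow one pattern: add the trend term and the relevant seasonal coefficient on the log scale, then raise 10 to that power. The sketch below is a hedged illustration with hypothetical function names. It uses the rounded coefficients from the regression output, so its results differ slightly from the printed forecasts, which appear to have been computed from unrounded coefficients.

```python
# Rounded coefficients from the regression output; Q4 is the baseline quarter
INTERCEPT = 1.0974
TREND = 0.0044
SEASONAL = {1: -0.1275, 2: -0.1157, 3: -0.1161, 4: 0.0}


def log_revenue(coded_quarter, quarter):
    """Predicted base-10 log of revenue for a coded quarter and calendar quarter."""
    return INTERCEPT + TREND * coded_quarter + SEASONAL[quarter]


def revenue_forecast(coded_quarter, quarter):
    """Predicted revenue, obtained by undoing the base-10 log."""
    return 10 ** log_revenue(coded_quarter, quarter)


# Coded quarters 91 through 95 correspond to 2022 Q4 through 2023 Q4
periods = [(91, 4), (92, 1), (93, 2), (94, 3), (95, 4)]
forecasts = [revenue_forecast(x, q) for x, q in periods]

print([round(f, 2) for f in forecasts])
```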

Copyright ©2024 Pearson Education, Inc.


Solutions to End-of-Section and Chapter Review Problems cdlxvii 16.45

(a)

(b)

               Coefficients   Standard Error   t Stat     P-value
Intercept      0.4322         0.0251           17.2338    0.0000
Coded Month    -0.0001        0.0001           -0.6549    0.5133
Jan            -0.0031        0.0314           -0.0983    0.9218
Feb            0.0044         0.0314           0.1397     0.8891
Mar            0.0311         0.0314           0.9906     0.3231
Apr            0.0445         0.0314           1.4171     0.1581
May            0.0616         0.0314           1.9607     0.0514
June           0.0692         0.0314           2.2031     0.0288
July           0.0550         0.0314           1.7520     0.0814
Aug            0.0567         0.0314           1.8062     0.0725
Sept           0.0509         0.0314           1.6199     0.1069
Oct            0.0391         0.0314           1.2465     0.2141
Nov            0.0136         0.0319           0.4264     0.6703

Predict log(Price) = 0.4322 – 0.0001 Coded Month – 0.0031Jan + 0.0044Feb + 0.0311Mar + 0.0445Apr + 0.0616May + 0.0692June + 0.0550July + 0.0567Aug + 0.0509Sept + 0.0391Oct + 0.0136Nov

(c)

$\log_{10} \hat{\beta}_1 = -0.0001$; $\hat{\beta}_1 = 10^{-0.0001} = 0.9998$

The estimated monthly compound growth rate is $(\hat{\beta}_1 - 1)100\% = -0.02\%$ after adjusting for the seasonal component.

Copyright ©2024 Pearson Education, Inc.


Solutions to End-of-Section and Chapter Review Problems cdlxix 16.45 cont.

(d)

Month   $b_i = \log_{10} \hat{\beta}_i$   $\hat{\beta}_i = 10^{b_i}$   $(\hat{\beta}_i - 1)100\%$
Jan     -0.0031                           0.9929                       -0.71%
Feb     0.0044                            1.0102                       1.02%
Mar     0.0311                            1.0743                       7.43%
Apr     0.0445                            1.1079                       10.79%
May     0.0616                            1.1523                       15.23%
June    0.0692                            1.1727                       17.27%
July    0.0550                            1.1350                       13.50%
Aug     0.0567                            1.1395                       13.95%
Sept    0.0509                            1.1243                       12.43%
Oct     0.0391                            1.0943                       9.43%
Nov     0.0136                            1.0318                       3.18%

The January, February, March, April, May, June, July, August, September, October, and November multipliers are –0.71%, 1.02%, 7.43%, 10.79%, 15.23%, 17.27%, 13.50%, 13.95%, 12.43%, 9.43%, and 3.18% relative to the December values, respectively.

(e)

Gasoline prices are lower from November to March and higher from April to October with the highest prices occurring in June.

16.46

(a)

Copyright ©2024 Pearson Education, Inc.


cdlxx Chapter 16: Time-Series Forecasting

Copyright ©2024 Pearson Education, Inc.


Solutions to End-of-Section and Chapter Review Problems cdlxxi 16.46 cont.

(b) Coefficients

Standard Error

t Stat

P-value

Intercept

5.0230

0.0870 57.7213

0.0000

Coded Month

0.0033

0.0006

5.5910

0.0000

Jan

-0.1241

0.1081

-1.1473

0.2536

Feb

-0.1728

0.1081

-1.5981

0.1127

Mar

-0.1021

0.1081

-0.9442

0.3470

Apr

-0.1400

0.1081

-1.2949

0.1979

May

-0.1793

0.1081

-1.6585

0.0999

June

-0.2193

0.1081

-2.0286

0.0448

July

-0.2015

0.1081

-1.8643

0.0648

Aug

-0.1626

0.1081

-1.5041

0.1353

Sept

-0.1131

0.1081

-1.0465

0.2975

Oct

-0.0819

0.1107

-0.7399

0.4608

Nov

-0.0624

0.1106

-0.5643

0.5736

Predict log(Volume) = 5.0230 + 0.0033 Coded Month – 0.1241Jan – 0.1728Feb – 0.1021Mar – 0.1400Apr – 0.1793May – 0.2193June – 0.2015July – 0.1626Aug – 0.1131Sept – 0.0819Oct – 0.0624Nov (c)

Sept 2022 Fitted value: $\log \hat{Y}_{128} = 5.0230 + 0.0033(128) - 0.1131 = 5.3293$

(d)

Sept 2022 Forecast: $\log \hat{Y}_{128} = 5.0230 + 0.0033(128) - 0.1131 = 5.3293$, so $\hat{Y}_{128} = 10^{5.3293} = 213,445.7622$ barrels
Oct 2022 Forecast: $\log \hat{Y}_{129} = 5.0230 + 0.0033(129) - 0.0819 = 5.3638$, so $\hat{Y}_{129} = 10^{5.3638} = 231,120.5189$ barrels
Nov 2022 Forecast: $\log \hat{Y}_{130} = 5.0230 + 0.0033(130) - 0.0624 = 5.3866$, so $\hat{Y}_{130} = 10^{5.3866} = 243,530.12$ barrels
Dec 2022 Forecast: $\log \hat{Y}_{131} = 5.0230 + 0.0033(131) = 5.4523$, so $\hat{Y}_{131} = 10^{5.4523} = 283,313.3103$ barrels

(e)

$\log_{10} \hat{\beta}_1 = 0.0033$; $\hat{\beta}_1 = 10^{0.0033} = 1.0076$

The estimated monthly compound growth rate is $(\hat{\beta}_1 - 1)100\% = 0.76\%$ after adjusting for the seasonal component.

(f)

$\log \hat{B}_{July} = -0.2015$, $\hat{B}_{July} = 0.6287$

The multiplier for July is 0.6287, which means the volume is 37.12% lower in July than in December.

Copyright ©2024 Pearson Education, Inc.


Solutions to End-of-Section and Chapter Review Problems cdlxxiii 16.47

(a)

(b)

The time series plot reveals a strong monthly seasonal pattern with high call volume that peaks between December and February and drops from March to lows in the summer months before rising again from October through February.

(c)

While the call volume varies seasonally, the overall volume remains fairly steady.

(d)

(e)

$\log_{10} \hat{\beta}_1 = -0.000523$; $\hat{\beta}_1 = 10^{-0.000523} = 0.9988$. The estimated monthly compound growth rate is $(\hat{\beta}_1 - 1)100\% = -0.1204\%$ after adjusting for the seasonal component.

(f)

$\log_{10} \hat{\beta}_2 = -0.0539$; $\hat{\beta}_2 = 10^{-0.0539} = 0.8833625$; $(\hat{\beta}_2 - 1)100\% = -11.6717\%$. The January values are estimated to have a mean of 11.67% below the December values.

16.47 cont.

(g)

Month 60: X = 59, M1 = M2 = M3 = M4 = M5 = M6 = M7 = M8 = M9 = M10 = M11 = 0. $\hat{Y}_{60} = 27919.65195$

(h)

Month 61: X = 60, M1 = 1; M2 = M3 = M4 = M5 = M6 = M7 = M8 = M9 = M10 = M11 = 0. $\hat{Y}_{61} = 24633.71996$

(i)

The call center can more accurately predict call volume by month, which will allow the center to allocate resources more effectively to account for seasonal variation in call volume.

16.48

(a)

(b) Coefficients

Standard Error

t Stat

P-value

Intercept

1.0654

0.0454

23.4866

0.0000

Coded Quarter

0.0043

0.0007

5.7455

0.0000

Q1

0.0073

0.0485

0.1502

0.8810

Q2

-0.0069

0.0485

-0.1417

0.8877

Q3

-0.0014

0.0485

-0.0288

0.9771

Predict log(Price) = 1.0654 + 0.0043 Coded Quarter + 0.0073Q1 – 0.0069Q2 – 0.0014Q3 (c)

$\log_{10} \hat{\beta}_1 = 0.0043$; $\hat{\beta}_1 = 10^{0.0043} = 1.0099$

The estimated quarterly compound growth rate is $(\hat{\beta}_1 - 1)100\% = 0.99\%$

(d)

$\log_{10} \hat{\beta}_2 = 0.0073$; $\hat{\beta}_2 = 10^{0.0073} = 1.0169$

The 1st quarter values are estimated to have a mean of 1.69% above the 4th quarter values. A review of the p-values associated with the t test on the slope of the coefficients reveals that the slope coefficients for Quarter 1, Quarter 2, and Quarter 3 are not significant at the 0.05 significance level.

(e)

2022 Q4, X = 79: $\log \hat{Y}_{79} = 1.0654 + 0.0043(79) = 1.4048$, so $\hat{Y}_{79} = 10^{1.4048} = 25.3951$ (US$)

Copyright ©2024 Pearson Education, Inc.


cdlxxvi Chapter 16: Time-Series Forecasting 16.48

(f)

cont.

2023 Q1 Forecast: log Yˆ80  1.0654  0.0043(80)  0.0073  1.4163 Yˆ80  101.4163  26.0815 (US$) 2023 Q2 Forecast: log Yˆ81  1.0654  0.0043(81)  0.0069  1.4065 Yˆ81  101.4065  25.4957 (US$) 2023 Q3 Forecast: log Yˆ82  1.0654  0.0043(82)  0.0014  1.4162 Yˆ82  101.4162  26.0780 (US$) 2023 Q4 Forecast: log Yˆ83  1.0654  0.0043(83)  1.4219 Yˆ82  101.4219  26.4200 (US$)

(g) The forecasts are not likely to be accurate given that the quarterly exponential trend model did not fit the data particularly well. The adjusted $r^2$ = 0.2715. In addition, the time series contained an irregular component from 2010 through 2013.

16.49

(a)

(b)
                 Coefficients   Standard Error    t Stat    P-value
Intercept           2.7778          0.0330       84.1762    0.0000
Coded Quarter       0.0072          0.0006       12.5343    0.0000
Q1                  0.0027          0.0353        0.0763    0.9394
Q2                 -0.0043          0.0353       -0.1212    0.9039
Q3                 -0.0012          0.0353       -0.0354    0.9719

Predicted log(Price) = 2.7778 + 0.0072 Coded Quarter + 0.0027 Q1 – 0.0043 Q2 – 0.0012 Q3

(c) $\log_{10}\hat{\beta}_1 = 0.0072$; $\hat{\beta}_1 = 10^{0.0072} = 1.016676$. The estimated quarterly compound growth rate is $(\hat{\beta}_1 - 1)100\% = 1.67\%$.

(d) $\log_{10}\hat{\beta}_2 = 0.0027$; $\hat{\beta}_2 = 10^{0.0027} = 1.006218$. The 1st quarter values are estimated to have a mean of 0.6218% above the 4th quarter values. A review of the p-values associated with the t tests on the slope coefficients reveals that the slope coefficients for Quarter 1, Quarter 2, and Quarter 3 are not significant at the 0.05 significance level.

(e) 2022 Q4, X = 75: $\log_{10}\hat{Y}_{75} = 2.7778 + 0.0072(75) = 3.3162$; $\hat{Y}_{75} = 10^{3.3162} = 2071.10$ (US dollars)

(f) 2023 Q1 forecast: $\log_{10}\hat{Y}_{76} = 2.7778 + 0.0072(76) + 0.0027 = 3.3261$; $\hat{Y}_{76} = 10^{3.3261} = 2118.71$ (US dollars)
2023 Q2 forecast: $\log_{10}\hat{Y}_{77} = 2.7778 + 0.0072(77) - 0.0043 = 3.3268$; $\hat{Y}_{77} = 10^{3.3268} = 2119.73$ (US dollars)
2023 Q3 forecast: $\log_{10}\hat{Y}_{78} = 2.7778 + 0.0072(78) - 0.0012 = 3.3365$; $\hat{Y}_{78} = 10^{3.3365} = 2170.14$ (US dollars)
2023 Q4 forecast: $\log_{10}\hat{Y}_{79} = 2.7778 + 0.0072(79) = 3.3449$; $\hat{Y}_{79} = 10^{3.3449} = 2212.67$ (US dollars)

(g)

The forecasts in (f) are not accurate because of a downward shift in the price in the 2nd quarter of 2013, followed by a flattening in the price over the remaining quarters.

16.50

A time series is a set of numerical data obtained at regular periods over time.

16.51

A trend is the overall long-term tendency or impression of upward or downward movements. The cyclical component depicts the up-and-down swings or movements through the series. Any observed data that do not follow the trend curve modified by the cyclical component are indicative of the irregular component. When data are recorded monthly or quarterly, an additional component called the seasonal factor is considered.

16.52

Moving averages take into account the results of a limited number of periods of time. Exponential smoothing takes into account all the time periods but gives increased weight to more recent time periods.
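The contrast drawn in 16.52 can be seen directly in code. The sketch below assumes the usual textbook definitions: a centered moving average of odd length L, and exponential smoothing with $E_1 = Y_1$ and $E_i = W Y_i + (1 - W) E_{i-1}$. The function names and the five-value series are made up for illustration.

```python
def moving_average(series, length):
    """Centered moving average for an odd length; None where it is undefined."""
    half = length // 2
    out = [None] * len(series)
    for i in range(half, len(series) - half):
        window = series[i - half:i + half + 1]  # only `length` periods are used
        out[i] = sum(window) / length
    return out


def exponential_smoothing(series, w):
    """E_1 = Y_1; E_i = w * Y_i + (1 - w) * E_{i-1} for i > 1."""
    smoothed = [series[0]]
    for y in series[1:]:
        # Every earlier value still contributes through smoothed[-1],
        # with geometrically decreasing weight.
        smoothed.append(w * y + (1 - w) * smoothed[-1])
    return smoothed


data = [4.0, 6.0, 5.0, 8.0, 7.0]  # hypothetical series
print(moving_average(data, 3))
print(exponential_smoothing(data, 0.5))
```

Each moving-average value depends only on its own window, while each smoothed value carries information from every earlier period.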

16.53

The exponential trend model is appropriate when the percentage difference from observation to observation is constant.

16.54

The linear trend model in this chapter has the time period as the X variable.

16.55

Autoregressive models have independent variables that are the dependent variable lagged by a given number of time periods.
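To make the idea of lagged predictors concrete, the sketch below builds the rows that a p-th order autoregression would be fitted on. The helper name `lagged_rows` and the sample values are hypothetical. Note that the first p observations have no complete set of lags, so they cannot be used as rows.

```python
def lagged_rows(series, p):
    """Return (y, [lag1, ..., lagp]) pairs for a p-th order autoregressive model."""
    rows = []
    for i in range(p, len(series)):
        lags = [series[i - k] for k in range(1, p + 1)]  # Y_{i-1}, ..., Y_{i-p}
        rows.append((series[i], lags))
    return rows


values = [10, 12, 13, 15, 18]  # hypothetical annual values
for y, lags in lagged_rows(values, 2):
    print(y, lags)
```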

16.56

The different methods for choosing an appropriate forecasting model are residual analysis, the standard error of the estimate, the mean absolute deviation, and parsimony.

16.57

The standard error of the estimate relies on the sum of the squared deviations between the observed value and the predicted value. This measure gives increased weight to large differences. The mean absolute deviation is the mean of the absolute value of the deviations between the observed value and predicted value.
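The two measures compared in 16.57 can be sketched as follows, assuming the usual definitions $S_{YX} = \sqrt{\sum (Y_i - \hat{Y}_i)^2 / (n - p - 1)}$, with p explanatory variables, and $MAD = \sum |Y_i - \hat{Y}_i| / n$. The function names and the small data set are illustrative only.

```python
import math


def syx(observed, predicted, n_predictors):
    """Standard error of the estimate: squares residuals, so large misses dominate."""
    sse = sum((y - f) ** 2 for y, f in zip(observed, predicted))
    return math.sqrt(sse / (len(observed) - n_predictors - 1))


def mad(observed, predicted):
    """Mean absolute deviation: every residual counts in proportion to its size."""
    return sum(abs(y - f) for y, f in zip(observed, predicted)) / len(observed)


observed = [10, 12, 15, 20]   # hypothetical data
predicted = [11, 12, 14, 16]  # hypothetical fitted values from a one-predictor model
print(syx(observed, predicted, n_predictors=1))
print(mad(observed, predicted))
```

In this example the single residual of 4 contributes 16 of the 18 units of squared error but only 4 of the 6 units of absolute error, which is the sense in which the standard error of the estimate gives extra weight to large differences.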


16.58

Forecasting for monthly or quarterly data uses an exponential trend model with dummy variables to represent either months or quarters.

16.59

(a)

(b) $\hat{Y} = 0.8267 + 0.4253X$, where X = years since 1915

(c) 1960: $\hat{Y} = 0.8267 + 0.4253(45) = 19.97$
1965: $\hat{Y} = 0.8267 + 0.4253(50) = 22.09$
1970: $\hat{Y} = 0.8267 + 0.4253(55) = 24.22$

(d) The actual rates, which varied across various sources located on the Internet, were extremely low and almost non-existent for the years 1965 and 1970.

(e) The forecasts made in (c) are not useful because the linear equation could not anticipate the discovery of a polio vaccine.

16.60

(a) Workforce:

(b) Simple Linear Regression Analysis

Regression Statistics
Multiple R            0.9501
R Square              0.9028
Adjusted R Square     0.9002
Standard Error        4555.6473
Observations          39

ANOVA
              df          SS                  MS              F
Regression     1    7130650204.4534    7130650204.4534    343.5808
Residual      37     767895120.9825      20753922.1887
Total         38    7898545325.4359

              Coefficients   Standard Error    t Stat    P-value
Intercept     119451.9487      1431.3576      83.4536    0.0000
Coded Year      1201.4372        64.8167      18.5359    0.0000

$\hat{Y} = 119{,}451.9487 + 1{,}201.4372X$, where X = years relative to 1984

(c) $\hat{Y}_{2023} = 119{,}451.9487 + 1{,}201.4372(39) = 166{,}308.001$ (thousands)
$\hat{Y}_{2024} = 119{,}451.9487 + 1{,}201.4372(40) = 167{,}509.439$ (thousands)

16.61

(a)

It would be reasonable to expect the price of natural gas would have a seasonal component which reflects the variation in the use of gas across seasonal temperature changes.


(b)

The time series plot for Commercial Price does appear to support the answer in (a) that there is a seasonal component. An overall downward adjustment in price in 2008 is followed by seasonal variation from 2011 through May of 2021.

The time series plot for Residential does appear to support the answer in (a) that there is a seasonal component from 2011 through 2022.


(c)

Commercial Price:

               Coefficients   Standard Error    t Stat    P-value
Intercept         0.9042          0.0134       67.6441    0.0000
Coded Month      -0.0001          0.0001       -1.0721    0.2858
Jan              -0.0031          0.0164       -0.1917    0.8483
Feb               0.0000          0.0164        0.0026    0.9979
Mar               0.0092          0.0164        0.5615    0.5755
Apr               0.0070          0.0168        0.4167    0.6776
May               0.0264          0.0168        1.5734    0.1182
June              0.0447          0.0168        2.6675    0.0087
July              0.0523          0.0168        3.1197    0.0023
Aug               0.0531          0.0168        3.1654    0.0020
Sept              0.0460          0.0168        2.7415    0.0070
Oct               0.0196          0.0168        1.1702    0.2442
Nov               0.0008          0.0168        0.0464    0.9631

Predicted Commercial Price: log(C Price) = 0.9042 – 0.0001 Coded Month – 0.0031 Jan + 0.00004 Feb + 0.0092 Mar + 0.0070 Apr + 0.0264 May + 0.0447 June + 0.0523 July + 0.0531 Aug + 0.0460 Sept + 0.0196 Oct + 0.0008 Nov

Residential Price:

               Coefficients   Standard Error    t Stat    P-value
Intercept         0.9613          0.0112       85.5486    0.0000
Coded Month       0.0004          0.0001        5.6424    0.0000
Jan              -0.0107          0.0138       -0.7734    0.4408
Feb              -0.0062          0.0138       -0.4481    0.6549
Mar               0.0149          0.0138        1.0787    0.2829
Apr               0.0474          0.0141        3.3578    0.0010
May               0.1199          0.0141        8.4989    0.0000
June              0.2020          0.0141       14.3247    0.0000
July              0.2444          0.0141       17.3376    0.0000
Aug               0.2591          0.0141       18.3780    0.0000
Sept              0.2336          0.0141       16.5723    0.0000
Oct               0.1255          0.0141        8.9025    0.0000
Nov               0.0271          0.0141        1.9258    0.0565

Predicted Residential Price: log(R Price) = 0.9613 + 0.0004 Coded Month – 0.0107 Jan – 0.0062 Feb + 0.0149 Mar + 0.0474 Apr + 0.1199 May + 0.2020 June + 0.2444 July + 0.2591 Aug + 0.2336 Sept + 0.1255 Oct + 0.0271 Nov

(d) Commercial: $\log_{10}\hat{\beta}_1 = -0.0001$; $\hat{\beta}_1 = 10^{-0.0001} = 0.9997853$. The estimated monthly compound growth rate is $(\hat{\beta}_1 - 1)100\% = -0.021\%$.
Residential: $\log_{10}\hat{\beta}_1 = 0.0004$; $\hat{\beta}_1 = 10^{0.0004} = 1.0009504$. The estimated monthly compound growth rate is $(\hat{\beta}_1 - 1)100\% = 0.095\%$.
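Both the compound growth rates and the seasonal multipliers in this problem come from the same conversion: a coefficient b on the log10 scale becomes a percentage change of $(10^b - 1)100\%$. The sketch below shows that conversion. The function name is illustrative, and the inputs are the rounded coefficients printed in the tables, so results can differ in the last digit from the solution's values, which appear to be based on unrounded coefficients.

```python
def percent_change_from_log10(b):
    """Convert a log10-scale regression coefficient into a percentage change."""
    return (10 ** b - 1) * 100


# Rounded coefficients as printed in the regression tables of 16.49 and 16.61.
for label, b in [("16.49 Coded Quarter", 0.0072),
                 ("16.61 Residential Aug", 0.2591),
                 ("16.61 Commercial Jan", -0.0031)]:
    print(f"{label}: {percent_change_from_log10(b):.2f}%")
```

A positive coefficient gives a multiplier above 1 (a mean above the base period), a negative coefficient gives a multiplier below 1, and a coefficient of 0 leaves the base-period mean unchanged.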

(e) Commercial:

Month    $b_i = \log_{10}\hat{\beta}_i$    $\hat{\beta}_i = 10^{b_i}$    $(\hat{\beta}_i - 1)100\%$
Jan           -0.0031          0.9928         -0.72%
Feb            0.0000          1.0001          0.01%
Mar            0.0092          1.0214          2.14%
Apr            0.0070          1.0162          1.62%
May            0.0264          1.0627          6.27%
June           0.0447          1.1085         10.85%
July           0.0523          1.1280         12.80%
Aug            0.0531          1.1300         13.00%
Sept           0.0460          1.1116         11.16%
Oct            0.0196          1.0462          4.62%
Nov            0.0008          1.0018          0.18%

January, February, March, April, and November are estimated to have means very close to the December values. May, June, July, August, September, and October are estimated to have a mean above the December values. A review of the p-values associated with the t tests on the slope coefficients reveals that the slope coefficients were not significant for the January, February, March, April, and May estimates. The slope coefficients for June, July, August, and September were significant at the 0.05 significance level. The slope coefficients were not significant for the months of October and November. The multipliers indicate that the monthly commercial prices for natural gas are highest in the summer months. The multipliers support the answers in (a) and (b).

Residential:

Month    $b_i = \log_{10}\hat{\beta}_i$    $\hat{\beta}_i = 10^{b_i}$    $(\hat{\beta}_i - 1)100\%$
Jan           -0.0107          0.9757         -2.43%
Feb           -0.0062          0.9859         -1.41%
Mar            0.0149          1.0349          3.49%
Apr            0.0474          1.1152         11.52%
May            0.1199          1.3178         31.78%
June           0.2020          1.5921         59.21%
July           0.2444          1.7556         75.56%
Aug            0.2591          1.8158         81.58%
Sept           0.2336          1.7123         71.23%
Oct            0.1255          1.3350         33.50%
Nov            0.0271          1.0645          6.45%

January, February, and March are estimated to have means very close to the December values. April, May, June, July, August, September, October, and November are estimated to have a mean above the December values. A review of the p-values associated with the t tests on the slope coefficients reveals that the slope coefficients were not significant for the January, February, and March estimates. The slope coefficients for April, May, June, July, August, September, and October were significant at the 0.05 significance level. The November coefficient was not significant at the 0.05 level. The multipliers indicate that the monthly residential prices for natural gas are higher in the spring, summer, and fall months, with the highest prices occurring in the summer months. The multipliers support the answers in (a) and (b).

(f) Both the residential and commercial prices for natural gas appear to be highest in the summer months. The results also revealed that the seasonal pattern appears to be stronger for residential prices than for commercial prices.

16.62

(a)


(b)

Linear (Simple Linear Regression Analysis): Revenues vs Coded Year

Regression Statistics
Multiple R            0.9507
R Square              0.9038
Adjusted R Square     0.9016
Standard Error        2.8492
Observations          47

ANOVA
              df        SS           MS           F
Regression     1    3430.8592    3430.8592    422.6250
Residual      45     365.3089       8.1180
Total         46    3796.1681

              Coefficients   Standard Error    t Stat    P-value
Intercept       -1.2784          0.8181       -1.5626    0.1252
Coded Year       0.6299          0.0306       20.5578    0.0000

Linear Predict: $\hat{Y} = -1.2784 + 0.6299X$, where X = years relative to 1975
tSTAT = 20.5578, p-value = 0.000 < 0.05, coded year is significant; $r^2$ = 0.9038. 90.38% of the variation in revenue in $billions is explained by year.

(c) Quadratic (Regression Analysis): Revenues vs Coded Year, Coded Year Sq

Regression Statistics
Multiple R            0.9515
R Square              0.9053
Adjusted R Square     0.9010
Standard Error        2.8582
Observations          47

ANOVA
              df        SS           MS           F
Regression     2    3436.7307    1718.3653    210.3511
Residual      44     359.4374       8.1690
Total         46    3796.1681

                Coefficients   Standard Error    t Stat    P-value
Intercept         -2.0198          1.1993       -1.6841    0.0992
Coded Year         0.7287          0.1206        6.0429    0.0000
Coded Year Sq     -0.0021          0.0025       -0.8478    0.4011

Quadratic Predict: $\hat{Y} = -2.0198 + 0.7287X - 0.0021X^2$, where X = years relative to 1975
For the full model, F = 210.3511, p-value = 0.0002, at least one X variable is significant. For Coded Year Sq, tSTAT = –0.8478, p-value = 0.4011 > 0.05, so Coded Year Sq is not significant; $r^2$ = 0.9053. 90.53% of the variation in revenue in $billions is explained by year.

(d) Exponential (Regression Analysis): Log(Revenues) vs Coded Year

Regression Statistics
Multiple R            0.9485
R Square              0.8997
Adjusted R Square     0.8975
Standard Error        0.1372
Observations          47

ANOVA
              df      SS        MS         F
Regression     1    7.5963    7.5963    403.7741
Residual      45    0.8466    0.0188
Total         46    8.4428

              Coefficients   Standard Error    t Stat    P-value
Intercept        0.2813          0.0394        7.1437    0.0000
Coded Year       0.0296          0.0015       20.0941    0.0000

Exponential Predict: $\log_{10}\hat{Y} = 0.2813 + 0.0296X$, where X = years relative to 1975
tSTAT = 20.0941, p-value = 0.000 < 0.05, coded year is significant; $r^2$ = 0.8997. 89.97% of the variation in the log of revenue is explained by year.

(e) Autoregressive: Regression Analysis for Revenue vs Lag1, Lag2, Lag3 (Rows unused: 3)

              Coefficients   Standard Error    t Stat    P-value
Intercept        0.5271          0.3205        1.6444    0.1079
Lag1             1.0714          0.2021        5.3006    0.0000
Lag2             0.2729          0.3637        0.7504    0.4574
Lag3            -0.3617          0.2154       -1.6792    0.1009

For the third-order term, tSTAT = –1.6792 with a p-value of 0.1009. The third-order term can be dropped because it is not significant at the 0.05 significance level.

Regression Analysis for Revenue vs Lag1, Lag2 (Rows unused: 2)

              Coefficients   Standard Error    t Stat    P-value
Intercept        0.5582          0.3115        1.7923    0.0803
Lag1             1.2587          0.1718        7.3271    0.0000
Lag2            -0.2722          0.1692       -1.6086    0.1152

For the second-order term, tSTAT = –1.6086 with a p-value of 0.1152. The second-order term can be dropped because it is not significant at the 0.05 significance level.

Regression Analysis for Revenue vs Lag1 (Rows unused: 1)

Regression Statistics
Multiple R            0.9923
R Square              0.9847
Adjusted R Square     0.9844
Standard Error        1.1241
Observations          46

ANOVA
              df        SS           MS            F
Regression     1    3588.2881    3588.2881    2839.8938
Residual      44      55.5953       1.2635
Total         45    3643.8834

              Coefficients   Standard Error    t Stat    P-value
Intercept        0.6699          0.2919        2.2949    0.0266
Lag1             0.9856          0.0185       53.2907    0.0000

For the first-order term, tSTAT = 53.2907 with a p-value of 0.0000. The first-order term cannot be dropped because it is significant at the 0.05 significance level. The first-order model is appropriate.

Autoregressive Predict: $\hat{Y}_i = 0.6699 + 0.9856Y_{i-1}$
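A fitted autoregressive equation produces a forecast by plugging in the most recent observed values. The sketch below applies the first-order equation above to the prior-year revenue of 23.2 that the solution quotes in part (i). The helper `ar_forecast` is a hypothetical name, written for any order p. Forecasting more than one step ahead by feeding each forecast back in as the next lagged value is an assumption of this sketch.

```python
def ar_forecast(history, intercept, coefs, steps=1):
    """Recursive forecasts from Y_i = a0 + a1*Y_{i-1} + ... + ap*Y_{i-p}.

    history: observed values, most recent last (at least p of them).
    coefs:   [a1, ..., ap] in lag order.
    """
    values = list(history)
    forecasts = []
    for _ in range(steps):
        # values[-k] is the value k periods before the one being forecast
        nxt = intercept + sum(a * values[-k] for k, a in enumerate(coefs, start=1))
        forecasts.append(nxt)
        values.append(nxt)  # later steps use earlier forecasts as lagged inputs
    return forecasts


# First-order model from 16.62 (e), with the prior-year revenue of 23.2 ($billions).
print(ar_forecast([23.2], intercept=0.6699, coefs=[0.9856], steps=2))
```

The first value printed matches the hand calculation 0.6699 + 0.9856(23.2).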

(f) Residual plots:
Linear: Displays a large curvilinear pattern. Do not consider this model.
Quadratic: Displays a large curvilinear pattern. Do not consider this model.
Exponential: Displays a large curvilinear pattern. Do not consider this model.
First-Order Autoregressive: Does not show a pattern. Consider this model.

(g)
Model                   Syx       MAD
Linear                 2.8492    2.1531
Quadratic              2.8582    2.2484
Exponential            6.5339    3.8816
Autoregressive-1st     1.1241    0.7533

(h) The residual plots reveal clear cyclical patterns for the linear, quadratic, and exponential models. The residual plot for the first-order autoregressive model revealed no cyclical pattern. Based on the results from (f), (g), and the principle of parsimony, the first-order autoregressive model would be best suited for forecasting.

(i) $\hat{Y}_i = 0.6699 + 0.9856Y_{i-1}$, where $Y_{i-1} = 23.2$
$\hat{Y}_{2022} = 0.6699 + 0.9856(23.2) = 23.536$ (billions of dollars)

16.63

(a)

(b) Diversified: Linear, Diversified Equity vs Coded Year, Simple Linear Regression Analysis

Regression Statistics
Multiple R            0.9143
R Square              0.8359
Adjusted R Square     0.8315
Standard Error        12.0538
Observations          39

ANOVA
              df        SS            MS            F
Regression     1    27387.6853    27387.6853    188.4984
Residual      37     5375.8791      145.2940
Total         38    32763.5644

              Coefficients   Standard Error    t Stat    P-value
Intercept       10.8112          3.7872        2.8546    0.0070
Coded Year       2.3546          0.1715       13.7295    0.0000

Linear Model, Diversified Equity: Predicted Diversified Equity = $\hat{Y} = 10.8112 + 2.3546X$, where X = years relative to 1984. t = 13.7295, p-value = 0.0000 < 0.05. Coded year is significant. $r^2$ = 0.8359. 83.59% of the variation in Diversified Equity is explained by year.

Balanced: Linear, Balanced vs Coded Year, Simple Linear Regression Analysis

Regression Statistics
Multiple R            0.5978
R Square              0.3573
Adjusted R Square     0.3399
Standard Error        2.0963
Observations          39

ANOVA
              df       SS          MS          F
Regression     1     90.3961     90.3961    20.5711
Residual      37    162.5900      4.3943
Total         38    252.9861

              Coefficients   Standard Error    t Stat    P-value
Intercept       14.6039          0.6586       22.1730    0.0000
Coded Year       0.1353          0.0298        4.5355    0.0001

Linear Model, Balanced: Predicted Balanced = $\hat{Y} = 14.6039 + 0.1353X$, where X = years relative to 1984. t = 4.5355, p-value = 0.0001 < 0.05. Coded year is significant. $r^2$ = 0.3573. 35.73% of the variation in Balanced is explained by year.

(c) Diversified: Quadratic, Diversified Equity vs Coded Year, Coded Year Sq, Regression Analysis

Regression Statistics
Multiple R            0.9176
R Square              0.8420
Adjusted R Square     0.8332
Standard Error        11.9932
Observations          39

ANOVA
              df        SS            MS            F
Regression     2    27585.4383    13792.7192    95.8914
Residual      36     5178.1261      143.8368
Total         38    32763.5644

                Coefficients   Standard Error    t Stat    P-value
Intercept         15.4733          5.4780        2.8246    0.0077
Coded Year         1.5986          0.6670        2.3967    0.0219
Coded Year Sq      0.0199          0.0170        1.1725    0.2487

Quadratic Model: Predicted Diversified Equity = $\hat{Y} = 15.4733 + 1.5986X + 0.0199X^2$, where X = years relative to 1984. For Coded Year Sq, t = 1.1725, p-value = 0.2487 > 0.05. Coded Year Sq is not significant. $r^2$ = 0.8420. 84.20% of the variation in Diversified Equity is explained by year.

Balanced: Quadratic, Balanced vs Coded Year, Coded Year Sq, Regression Analysis

Regression Statistics
Multiple R            0.9847
R Square              0.9696
Adjusted R Square     0.9679
Standard Error        0.4624
Observations          39

ANOVA
              df       SS          MS           F
Regression     2    245.2894    122.6447    573.6492
Residual      36      7.6967      0.2138
Total         38    252.9861

                Coefficients   Standard Error    t Stat     P-value
Intercept         10.4778          0.2112       49.6111     0.0000
Coded Year         0.8044          0.0257       31.2811     0.0000
Coded Year Sq     -0.0176          0.0007      -26.9163     0.0000

Quadratic Model: Predicted Balanced = $\hat{Y} = 10.4778 + 0.8044X - 0.0176X^2$, where X = years relative to 1984. For Coded Year Sq, t = –26.9163, p-value = 0.0000 < 0.05. Coded Year Sq is significant. $r^2$ = 0.9696. 96.96% of the variation in Balanced is explained by year.


(d) Diversified: Exponential, log(Diversified Equity) vs Coded Year, Regression Analysis

Regression Statistics
Multiple R            0.9202
R Square              0.8469
Adjusted R Square     0.8427
Standard Error        0.1068
Observations          39

ANOVA
              df      SS        MS         F
Regression     1    2.3337    2.3337    204.6059
Residual      37    0.4220    0.0114
Total         38    2.7557

              Coefficients   Standard Error    t Stat    P-value
Intercept        1.2606          0.0336       37.5681    0.0000
Coded Year       0.0217          0.0015       14.3041    0.0000

Exponential model: log(predicted Diversified Equity) = $\log_{10}\hat{Y} = 1.2606 + 0.0217X$, where X = years relative to 1984. t = 14.3041, p-value = 0.0000 < 0.05; Coded Year is significant; $r^2$ = 0.8469. 84.69% of the variation in the log of Diversified Equity is explained by year.

Balanced: Exponential, log(Balanced) vs Coded Year, Regression Analysis

Regression Statistics
Multiple R            0.6128
R Square              0.3755
Adjusted R Square     0.3587
Standard Error        0.0585
Observations          39

ANOVA
              df      SS        MS         F
Regression     1    0.0763    0.0763    22.2505
Residual      37    0.1268    0.0034
Total         38    0.2031

              Coefficients   Standard Error    t Stat    P-value
Intercept        1.1547          0.0184       62.7733    0.0000
Coded Year       0.0039          0.0008        4.7170    0.0000

Exponential model: log(predicted Balanced) = $\log_{10}\hat{Y} = 1.1547 + 0.0039X$, where X = years relative to 1984. t = 4.7170, p-value = 0.0000 < 0.05; Coded Year is significant; $r^2$ = 0.3755. 37.55% of the variation in the log of Balanced is explained by year.

(e) Diversified Equity: Third-Order Autoregressive, Diversified Equity vs Lag1, Lag2, Lag3, Regression Analysis

              Coefficients   Standard Error    t Stat    P-value
Intercept        1.2631          3.6890        0.3424    0.7343
Lag1             1.0847          0.1833        5.9187    0.0000
Lag2            -0.2515          0.2627       -0.9572    0.3456
Lag3             0.2098          0.1913        1.0967    0.2810

For the third-order term, tSTAT = 1.0967 with a p-value of 0.2810. The third-order term can be dropped because it is not significant at the 0.05 significance level.

Diversified Equity: Second-Order Autoregressive, Diversified Equity vs Lag1, Lag2, Regression Analysis

              Coefficients   Standard Error    t Stat    P-value
Intercept        1.8183          3.4520        0.5267    0.6018
Lag1             1.0777          0.1807        5.9635    0.0000
Lag2            -0.0540          0.1885       -0.2865    0.7762

For the second-order term, tSTAT = –0.2865 with a p-value of 0.7762. The second-order term can be dropped because it is not significant at the 0.05 significance level.

Diversified Equity: First-Order Autoregressive, Diversified Equity vs Lag1, Regression Analysis

Regression Statistics
Multiple R            0.9549
R Square              0.9118
Adjusted R Square     0.9094
Standard Error        8.7009
Observations          38

ANOVA
              df        SS            MS            F
Regression     1    28189.8623    28189.8623    372.3640
Residual      36     2725.3843       75.7051
Total         37    30915.2465

              Coefficients   Standard Error    t Stat    P-value
Intercept        1.4834          3.1890        0.4652    0.6446
Lag1             1.0316          0.0535       19.2967    0.0000

For the first-order term, tSTAT = 19.2967 with a p-value of 0.0000. The first-order term should be retained because it is significant at the 0.05 significance level.

Autoregression model (First Order): Predicted Diversified Equity = $\hat{Y}_i = 1.4834 + 1.0316Y_{i-1}$

Balanced: Third-Order Autoregressive, Balanced vs Lag1, Lag2, Lag3, Regression Analysis

              Coefficients   Standard Error    t Stat    P-value
Intercept        0.8444          0.3726        2.2661    0.0303
Lag1             1.4867          0.1765        8.4222    0.0000
Lag2            -0.3684          0.3128       -1.1777    0.2476
Lag3            -0.1646          0.1610       -1.0223    0.3143

For the third-order term, tSTAT = –1.0223 with a p-value of 0.3143. The third-order term can be dropped because it is not significant at the 0.05 significance level.

Balanced: Second-Order Autoregressive, Balanced vs Lag1, Lag2, Regression Analysis

Regression Statistics
Multiple R            0.9949
R Square              0.9897
Adjusted R Square     0.9891
Standard Error        0.2228
Observations          37

ANOVA
              df       SS          MS           F
Regression     2    162.8774    81.4387    1640.0175
Residual      34      1.6883     0.0497
Total         36    164.5657

              Coefficients   Standard Error    t Stat    P-value
Intercept        0.8740          0.3354        2.6060    0.0135
Lag1             1.6222          0.1133       14.3230    0.0000
Lag2            -0.6698          0.1020       -6.5683    0.0000

For the second-order term, tSTAT = –6.5683 with a p-value of 0.0000. The second-order term should be retained because it is significant at the 0.05 significance level.

Autoregression model (Second Order): Predicted Balanced = $\hat{Y}_i = 0.8740 + 1.6222Y_{i-1} - 0.6698Y_{i-2}$

(f) [Residual plots: Linear, Quadratic, Exponential, Autoregressive]

(g)
Diversified Equity      r2        SYX        MAD
Linear                0.8359    12.0538    8.0378
Quadratic             0.842     11.9932    8.362
Exponential           0.8469    12.5313    9.5213
AR-First Order        0.9118     8.7009    5.9918

Balanced                r2        SYX        MAD
Linear                0.3573     2.0963    1.8259
Quadratic             0.9696     0.4624    0.2891
Exponential           0.3755     2.1897    1.9171
AR-Second Order       0.9897     0.2228    0.1556

(h) For the Diversified Equity Fund, the first-order autoregressive model appears to be the best model based on the results from (f) and (g) and the principle of parsimony. The residual plots for the other models revealed clear patterns while the residual plot for the first-order autoregressive model revealed no clear pattern. The first-order autoregressive model also had the lowest Syx and MAD values, and the highest $r^2$.
For the Balanced Fund, the second-order autoregressive model appears to be the best model based on the results from (f) and (g) and the principle of parsimony. The residual plots for the linear, quadratic, and exponential models had strong patterns. The residual plot for the second-order autoregressive model revealed less of a pattern relative to the other models. The second-order autoregressive model had the lowest Syx and MAD values, and the highest $r^2$.

(i) Diversified (First-Order Autoregression model): $\hat{Y}_{2023} = 1.4834 + 1.0316Y_{2022} = 1.4834 + 1.0316(133.741) = 139.4525$
Balanced (Second-Order Autoregression model): $\hat{Y}_{2023} = 0.8740 + 1.6222Y_{2022} - 0.6698Y_{2021} = 0.8740 + 1.6222(17.271) - 0.6698(17.000) = 17.5036$

(j) Based on the results from (a) through (i), one would recommend that a member of the Teacher’s Retirement System of the City of New York invest most of their retirement in the Diversified Equity Fund, with possibly a small percentage in the Balanced Fund. The member should be advised that the Diversified Equity Fund does have more risk than the Balanced Fund and that this should be considered. If the member prefers to have almost no risk, most of the retirement should be invested in the Balanced Fund. However, the member should be aware that the value of the Diversified Equity Fund increased by 920% from 1984 to 2022 compared to only a 67% increase in value for the Stable-Value Fund. A second-order autoregressive model was able to account for 99% of the variation in the Balanced Fund price while a first-order model was able to account for 91% of the variation in the Diversified Equity Fund price. For most individuals willing to take some risk, the Diversified Equity Fund would clearly be the fund in which to invest most of a member’s retirement funds.


16.64

Each of the currencies, Canadian dollar (CAD), Japanese yen (JPY), and British pound (BPD), is expressed in units per U.S. dollar (USD).

A time series analysis of the Canadian dollar (CAD) reveals a moderate cyclical component with up and down cycles of varying durations. Although the currency rate varies in cycles, the 1980 exchange rate is fairly similar to the 2022 exchange rate. A review of residual plots reveals that the linear, quadratic, and exponential models may be problematic due to cyclical variation in the residuals. In contrast, a first-order autoregressive model revealed a random pattern of residuals. In addition, the first-order autoregressive model had the smallest standard error of the estimate and MAD values, and the highest $r^2$. The first-order autoregressive model was the most appropriate model to use for forecasting. Using this model, the forecasted exchange rate is 1.2573 (units per $ U.S.) and 1.2606 (units per $ U.S.) for 2023 and 2024, respectively.

A time series analysis of the Japanese yen exchange rate revealed a steep drop in the rate beginning in 1986. The rate dropped from 238.47 (units per $ U.S.) in 1985 to 128.17 (units per $ U.S.) in 1988. The exchange rate had a cyclical component from that point forward with an overall declining trend. A review of the residual plots, standard error of the estimate, MAD values, and $r^2$ revealed that the first-order autoregressive model was the most appropriate for forecasting. Using this model, the forecasted exchange rate is 109.5270 (units per $ U.S.) and 109.2481 (units per $ U.S.) for 2023 and 2024, respectively.

A time series analysis of the British pound reveals a moderate cyclical component with up and down cycles of varying durations. Although the currency rate varies in cycles, the exchange rate increased from 0.4302 (units per $ U.S.) in 1980 to 0.7847 (units per $ U.S.) in 2019. A review of the residual plots and the standard error of the estimate and MAD values revealed that the first-order autoregressive model was the most appropriate for forecasting. Using this model, the forecasted exchange rate is 0.7076 (units per $ U.S.) and 0.6941 (units per $ U.S.) for 2023 and 2024, respectively.

An unexpected irregular component in the future could not be anticipated by the autoregressive models used for each of the currencies.

CAD Linear

Simple Linear Regression Analysis: CAD vs Coded Year

Regression Statistics
Multiple R            0.1067
R Square              0.0114
Adjusted R Square    -0.0127
Standard Error        0.1489
Observations          43

ANOVA
              df      SS        MS        F
Regression     1    0.0105    0.0105    0.4724
Residual      41    0.9088    0.0222
Total         42    0.9193

              Coefficients   Standard Error    t Stat    P-value
Intercept        1.2902          0.0446       28.9090    0.0000
Coded Year      -0.0013          0.0018       -0.6873    0.4958

Predicted CAD = $\hat{Y} = 1.2902 - 0.0013X$, where X = years relative to 1980. t = –0.6873, p-value = 0.4958 > 0.05. Coded year is not significant. $r^2$ = 0.0114. 1.14% of the variation in CAD is explained by year.

CAD Quadratic

Regression Analysis: CAD vs Coded Year, Coded Year Sq

Regression Statistics
Multiple R            0.1282
R Square              0.0164
Adjusted R Square    -0.0328
Standard Error        0.1503
Observations          43

ANOVA
              df      SS        MS        F
Regression     2    0.0151    0.0075    0.3340
Residual      40    0.9042    0.0226
Total         42    0.9193

                Coefficients   Standard Error    t Stat    P-value
Intercept          1.2685          0.0657       19.3067    0.0000
Coded Year         0.0019          0.0072        0.2637    0.7934
Coded Year Sq     -0.0001          0.0002       -0.4524    0.6534

Predicted CAD = $\hat{Y} = 1.2685 + 0.0019X - 0.0001X^2$, where X = years relative to 1980. For Coded Year Sq, t = –0.4524, p-value = 0.6534 > 0.05. Coded Year Sq is not significant. $r^2$ = 0.0164. 1.64% of the variation in CAD is explained by year.

CAD Exponential

Simple Linear Regression Analysis: log(CAD) vs Coded Year

Regression Statistics
Multiple R            0.1223
R Square              0.0150
Adjusted R Square    -0.0091
Standard Error        0.0520
Observations          43

ANOVA
              df    SS        MS        F
Regression     1    0.0017    0.0017    0.6227
Residual      41    0.1109    0.0027
Total         42    0.1126

              Coefficients    Standard Error    t Stat     P-value
Intercept      0.1093          0.0156            7.0085    0.0000
Coded Year    -0.0005          0.0006           -0.7891    0.4346

log(predicted CAD): $\log_{10}\hat{Y} = 0.1093 - 0.0005X$, where X = years relative to 1980. t = –0.7891, p-value = 0.4346 > 0.05. Coded year is not significant. $r^2$ = 0.0150. 1.50% of the variation in CAD is explained by year.
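An exponential trend equation is fitted on the log10 scale, so a forecast in the original units requires a back-transformation. The sketch below illustrates that step with the CAD coefficients reported above. It assumes that X = year − 1980 (so 2023 corresponds to X = 43), and the function names are illustrative only. Because the exponential model was not the model selected for forecasting, the printed values are a demonstration of the arithmetic, not the manual's forecast.

```python
# Coefficients copied from the log(CAD) vs Coded Year output above.
B0 = 0.1093   # intercept on the log10 scale
B1 = -0.0005  # slope on the log10 scale


def exponential_trend_forecast(b0: float, b1: float, x: float) -> float:
    """Back-transform a log10 trend line to the original units."""
    return 10 ** (b0 + b1 * x)


def annual_growth_rate_percent(b1: float) -> float:
    """Estimated compound growth rate per year, in percent."""
    return (10 ** b1 - 1) * 100


# Assumption: X = year - 1980, so 2023 corresponds to X = 43.
forecast_2023 = exponential_trend_forecast(B0, B1, 2023 - 1980)
growth = annual_growth_rate_percent(B1)

print(f"Exponential trend value at X = 43: {forecast_2023:.4f}")
print(f"Estimated annual growth rate: {growth:.3f}%")
```

A negative growth rate here simply restates the negative (and statistically insignificant) slope on the log scale.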

CAD Autoregressive

In the third-order autoregressive model, the third-order term (Lag3) had p-value = 0.2025 > 0.05, so the third-order term can be dropped. In the second-order autoregressive model, the second-order term (Lag2) had p-value = 0.5148 > 0.05, so the second-order term can be dropped.
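The order-selection procedure used here (fit the highest-order model, test its highest-order term, and step down while that term is not significant) can be sketched as a short loop. The function name is hypothetical. The third- and second-order p-values are the ones quoted above, and the first-order p-value (0.0000) is taken from the Lag1 output for CAD.

```python
def select_ar_order(highest_term_pvalues: dict, alpha: float = 0.05) -> int:
    """Return the largest order whose highest-order term is significant.

    highest_term_pvalues maps an order p to the p-value of the Lag-p term
    in the p-th order autoregressive model. Returns 0 if no order qualifies.
    """
    for order in sorted(highest_term_pvalues, reverse=True):
        if highest_term_pvalues[order] < alpha:
            return order
    return 0


# p-values of the highest-order term in each CAD autoregressive model.
cad_pvalues = {3: 0.2025, 2: 0.5148, 1: 0.0000}
chosen_order = select_ar_order(cad_pvalues)
print(f"Selected autoregressive order: {chosen_order}")
```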

Simple Linear Regression Analysis: CAD vs Lag1

Regression Statistics
Multiple R            0.8164
R Square              0.6666
Adjusted R Square     0.6582
Standard Error        0.0871
Observations          42

ANOVA
              df    SS        MS        F
Regression     1    0.6067    0.6067    79.9618
Residual      40    0.3035    0.0076
Total         41    0.9101

              Coefficients    Standard Error    t Stat    P-value
Intercept      0.2391          0.1156           2.0680    0.0451
Lag1           0.8124          0.0909           8.9421    0.0000

Predicted CAD: $\hat{Y}_i = 0.2391 + 0.8124Y_{i-1}$. For the first-order term, tSTAT = 8.9421, p-value = 0.0000 < 0.05. The first-order term is significant. $r^2$ = 0.6666. 66.66% of the variation in CAD is explained by the previous year's CAD value.

CAD               r2        SYX       MAD
Linear            0.0114    0.1489    0.1200
Quadratic         0.0164    0.1503    0.1204
Exponential       0.0150    0.1492    0.1210
AR-First Order    0.6666    0.0871    0.0694
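The comparison above amounts to choosing the model with the smallest SYX and MAD and then iterating the first-order autoregressive equation forward. The following is a minimal sketch of both steps. The SYX and MAD values and the coefficients 0.2391 and 0.8124 come from the CAD output, but the last observed value of 1.30 is hypothetical because the data file is not reproduced here, so the printed forecasts are illustrative only.

```python
# Summary values copied from the CAD comparison table.
models = {
    "Linear": {"syx": 0.1489, "mad": 0.1200},
    "Quadratic": {"syx": 0.1503, "mad": 0.1204},
    "Exponential": {"syx": 0.1492, "mad": 0.1210},
    "AR-First Order": {"syx": 0.0871, "mad": 0.0694},
}

best_by_syx = min(models, key=lambda name: models[name]["syx"])
best_by_mad = min(models, key=lambda name: models[name]["mad"])


def ar1_forecasts(b0: float, b1: float, last_value: float, steps: int) -> list:
    """Iterate Y-hat = b0 + b1 * (previous value) for the given number of steps."""
    forecasts = []
    previous = last_value
    for _ in range(steps):
        previous = b0 + b1 * previous
        forecasts.append(previous)
    return forecasts


# Hypothetical last observed exchange rate (NOT taken from the data file).
last_observed = 1.30
two_ahead = ar1_forecasts(0.2391, 0.8124, last_observed, steps=2)

print(f"Lowest SYX: {best_by_syx}; lowest MAD: {best_by_mad}")
print("Illustrative forecasts:", [round(v, 4) for v in two_ahead])
```

Note that the second forecast uses the first forecast as its lagged value, since no observed value exists for that year.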

Japanese yen (JPY)

JPY Linear

Simple Linear Regression Analysis: JPY vs Coded Year

Regression Statistics
Multiple R            0.7218
R Square              0.5210
Adjusted R Square     0.5093
Standard Error        31.7874
Observations          43

ANOVA
              df    SS            MS            F
Regression     1    45061.4687    45061.4687    44.5961
Residual      41    41427.8707     1010.4359
Total         42    86489.3394

              Coefficients    Standard Error    t Stat     P-value
Intercept      186.7181        9.5284           19.5960    0.0000
Coded Year      -2.6086        0.3906           -6.6780    0.0000

Predicted JPY: $\hat{Y} = 186.7181 - 2.6086X$, where X = years relative to 1980. t = –6.6780, p-value = 0.0000 < 0.05. Coded year is significant. $r^2$ = 0.5210. 52.10% of the variation in JPY is explained by year.
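The quantities in a regression output are linked by simple identities, which makes it possible to sanity-check a flattened table. The sketch below recomputes $r^2$, the F statistic, and the t statistic for the JPY linear model from the reported sums of squares and coefficients. The variable names are illustrative, and small differences from the printed values are expected because the inputs are rounded.

```python
# Values copied from the JPY linear trend output.
ssr = 45061.4687       # regression sum of squares
sst = 86489.3394       # total sum of squares
sse = 41427.8707       # residual sum of squares
df_residual = 41
slope = -2.6086
slope_std_error = 0.3906

r_squared = ssr / sst                 # coefficient of determination
mse = sse / df_residual               # mean square error
f_stat = ssr / mse                    # MSR = SSR / 1 in simple regression
t_stat = slope / slope_std_error      # t statistic for the slope

print(f"r^2 = {r_squared:.4f}")
print(f"F   = {f_stat:.4f}")
print(f"t   = {t_stat:.4f}, t^2 = {t_stat ** 2:.4f}")
```

In simple linear regression the square of the slope's t statistic equals the F statistic, so the last line should land close to the F value.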

JPY Quadratic

Regression Analysis: JPY vs Coded Year, Coded Year Sq

Regression Statistics
Multiple R            0.8989
R Square              0.8080
Adjusted R Square     0.7984
Standard Error        20.3744
Observations          43

ANOVA
              df    SS            MS            F
Regression     2    69884.6962    34942.3481    84.1749
Residual      40    16604.6432      415.1161
Total         42    86489.3394

                 Coefficients    Standard Error    t Stat      P-value
Intercept         236.8211        8.9039            26.5976    0.0000
Coded Year         -9.9408        0.9807           -10.1367    0.0000
Coded Year Sq       0.1746        0.0226             7.7329    0.0000

Predicted JPY: $\hat{Y} = 236.8211 - 9.9408X + 0.1746X^2$, where X = years relative to 1980. For the quadratic term, t = 7.7329, p-value = 0.0000 < 0.05. The quadratic term is significant. $r^2$ = 0.8080. 80.80% of the variation in JPY is explained by year.

JPY Exponential

Simple Linear Regression Analysis: log(JPY) vs Coded Year

Regression Statistics
Multiple R            0.7320
R Square              0.5359
Adjusted R Square     0.5245
Standard Error        0.0880
Observations          43

ANOVA
              df    SS        MS        F
Regression     1    0.3670    0.3670    47.3342
Residual      41    0.3179    0.0078
Total         42    0.6848

              Coefficients    Standard Error    t Stat     P-value
Intercept      2.2565          0.0264           85.4939    0.0000
Coded Year    -0.0074          0.0011           -6.8800    0.0000

log(predicted JPY): $\log_{10}\hat{Y} = 2.2565 - 0.0074X$, where X = years relative to 1980. t = –6.8800, p-value = 0.0000 < 0.05. Coded year is significant. $r^2$ = 0.5359. 53.59% of the variation in JPY is explained by year.

JPY Autoregressive

In the third-order autoregressive model, the third-order term (Lag3) had p-value = 0.6463 > 0.05, so the third-order term can be dropped. In the second-order autoregressive model, the second-order term (Lag2) had p-value = 0.0841 > 0.05, so the second-order term can be dropped.

Simple Linear Regression Analysis: JPY vs Lag1

Regression Statistics
Multiple R            0.9396
R Square              0.8829
Adjusted R Square     0.8799
Standard Error        15.0461
Observations          42

ANOVA
              df    SS            MS            F
Regression     1    68253.7451    68253.7451    301.4936
Residual      40     9055.4160      226.3854
Total         41    77309.1610

              Coefficients    Standard Error    t Stat     P-value
Intercept      11.6679         7.1823            1.6245    0.1121
Lag1            0.8909         0.0513           17.3636    0.0000

Predicted JPY: $\hat{Y}_i = 11.6679 + 0.8909Y_{i-1}$. For the first-order term, tSTAT = 17.3636, p-value = 0.0000 < 0.05. The first-order term is significant. $r^2$ = 0.8829. 88.29% of the variation in JPY is explained by the previous year's JPY value.

JPY               r2        SYX        MAD
Linear            0.5210    31.7874    26.4410
Quadratic         0.8080    20.3744    16.3532
Exponential       0.5359    30.3981    22.9516
AR-First Order    0.8829    15.0461    10.6744

British pound (BPD)

BPD Linear

Simple Linear Regression Analysis: BPD vs Coded Year

Regression Statistics
Multiple R            0.4775
R Square              0.2280
Adjusted R Square     0.2091
Standard Error        0.0785
Observations          43

ANOVA
              df    SS        MS        F
Regression     1    0.0746    0.0746    12.1074
Residual      41    0.2528    0.0062
Total         42    0.3274

              Coefficients    Standard Error    t Stat     P-value
Intercept      0.5677          0.0235           24.1201    0.0000
Coded Year     0.0034          0.0010            3.4796    0.0012

Predicted BPD: $\hat{Y} = 0.5677 + 0.0034X$, where X = years relative to 1980. t = 3.4796, p-value = 0.0012 < 0.05. Coded year is significant. $r^2$ = 0.2280. 22.80% of the variation in BPD is explained by year.

BPD Quadratic

Regression Analysis: BPD vs Coded Year, Coded Year Sq

Regression Statistics
Multiple R            0.5661
R Square              0.3205
Adjusted R Square     0.2865
Standard Error        0.0746
Observations          43

ANOVA
              df    SS        MS        F
Regression     2    0.1049    0.0525    9.4330
Residual      40    0.2225    0.0056
Total         42    0.3274

                 Coefficients    Standard Error    t Stat     P-value
Intercept         0.6230          0.0326           19.1164    0.0000
Coded Year       -0.0047          0.0036           -1.3210    0.1940
Coded Year Sq     0.0002          0.0001            2.3336    0.0247

Predicted BPD: $\hat{Y} = 0.6230 - 0.0047X + 0.0002X^2$, where X = years relative to 1980. For the quadratic term, t = 2.3336, p-value = 0.0247 < 0.05. The quadratic term is significant. $r^2$ = 0.3205. 32.05% of the variation in BPD is explained by year.

BPD Exponential

Simple Linear Regression Analysis: log(BPD) vs Coded Year

Regression Statistics
Multiple R            0.4722
R Square              0.2230
Adjusted R Square     0.2041
Standard Error        0.0546
Observations          43

ANOVA
              df    SS        MS        F
Regression     1    0.0350    0.0350    11.7672
Residual      41    0.1221    0.0030
Total         42    0.1572

              Coefficients    Standard Error    t Stat      P-value
Intercept     -0.2475          0.0164           -15.1303    0.0000
Coded Year     0.0023          0.0007             3.4303    0.0014

log(predicted BPD): $\log_{10}\hat{Y} = -0.2475 + 0.0023X$, where X = years relative to 1980. t = 3.4303, p-value = 0.0014 < 0.05. Coded year is significant. $r^2$ = 0.2230. 22.30% of the variation in BPD is explained by year.

BPD Autoregressive

In the third-order autoregressive model, the third-order term (Lag3) had p-value = 0.5215 > 0.05, so the third-order term can be dropped. In the second-order autoregressive model, the second-order term (Lag2) had p-value = 0.1946 > 0.05, so the second-order term can be dropped.

Simple Linear Regression Analysis: BPD vs Lag1

Regression Statistics
Multiple R            0.7569
R Square              0.5728
Adjusted R Square     0.5622
Standard Error        0.0550
Observations          42

ANOVA
              df    SS        MS        F
Regression     1    0.1622    0.1622    53.6421
Residual      40    0.1209    0.0030
Total         41    0.2831

              Coefficients    Standard Error    t Stat    P-value
Intercept      0.1899          0.0625           3.0402    0.0042
Lag1           0.7125          0.0973           7.3241    0.0000

Predicted BPD: $\hat{Y}_i = 0.1899 + 0.7125Y_{i-1}$. For the first-order term, tSTAT = 7.3241, p-value = 0.0000 < 0.05. The first-order term is significant. $r^2$ = 0.5728. 57.28% of the variation in BPD is explained by the previous year's BPD value.

BPD               r2        SYX       MAD
Linear            0.2280    0.0785    0.0617
Quadratic         0.3205    0.0746    0.0570
Exponential       0.2230    0.0782    0.0610
AR-First Order    0.5728    0.0550    0.0433


Chapter 17

17.1

The three major categories of business analytics are descriptive, predictive, and prescriptive. Descriptive analytics summarize historical data to facilitate the identification of potential patterns or trends that could lead to more in-depth analyses. Predictive analytics utilize historical data to understand relationships among variables or to predict values of a dependent variable. Prescriptive analytics evaluate business models to facilitate the identification of operational improvement strategies.

17.2

Limited information technology and data management systems prevented the widespread adoption of business analytics in the past.

17.3

What is data mining? Data mining involves the extraction of valuable data through model building and/or descriptive or predictive analytics.

17.4

Artificial intelligence is a type of computer science that utilizes software solutions to simulate human expertise, reasoning, or knowledge. Machine learning is one of the artificial intelligence tools used to automate model building.

17.5

Exploratory models facilitate the understanding of the relationship among variables. Predictive models seek to predict individual cases rather than an estimate of the general case. Dashboards typically allow users to drill down to various levels of detail.

17.6

Decision rules compare decision criteria to facilitate prediction.

17.7

Dashboards facilitate the exploration of data by displaying critical pieces of information in a visual format that allows users to quickly interpret the overall status of an activity or event.

17.8

Data dimensionality represents the number of variables associated with visualizing a data item. Color, size, and motion represent additional dimensions that can be used to describe a data item.

17.9

Classification trees and regression trees are decision trees that split data into groups based on the values of independent or explanatory variables. Classification trees use a categorical dependent variable, while regression trees use a numerical dependent variable.

17.10

Clustering involves the grouping of items into sets based on the similarity of items. A calculated distance is used to determine the similarity of items. Association analysis assesses the similarity of the values that comprise one item.
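To make the idea of a calculated distance concrete, the sketch below uses Euclidean distance, which is one common choice and is assumed here only for illustration. The items, their coordinates, and the function names are hypothetical.

```python
import math


def euclidean_distance(a, b) -> float:
    """Straight-line distance between two items described by numeric variables."""
    if len(a) != len(b):
        raise ValueError("items must have the same number of variables")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def most_similar(item, candidates: dict) -> str:
    """Name of the candidate with the smallest distance to the item."""
    return min(candidates, key=lambda name: euclidean_distance(item, candidates[name]))


# Hypothetical items measured on two variables.
candidates = {"A": (1.0, 1.0), "B": (5.0, 5.0), "C": (9.0, 1.0)}
new_item = (4.0, 4.5)

print(most_similar(new_item, candidates))
```

Smaller distances indicate more similar items, so a clustering method would tend to place the new item in the same group as its nearest neighbors.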

17.11

Text analytics represents a blend of descriptive and prescriptive analytics. Text analytics utilize clustering and association methods to automate analysis and interpretation of text.

17.12

A large language model (LLM) is a type of artificial intelligence algorithm using deep learning techniques on massive amounts of data.

17.13

The two primary approaches to prescriptive analytics are optimization and simulation. Optimization involves the setting of constraints to assist in the development of a decision model that represents the optimal way of managing a business process. Simulation involves the repeating of a predictive analytics model by varying the assumptions or the data associated with the model. Decision criteria are used to select a particular version of the model, which may or may not be optimal. Simulation can be used when a business process is not well understood.
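The simulation approach described in 17.13 can be illustrated with a small Monte Carlo sketch that reruns a simple model many times while varying its assumptions. Everything in the example is hypothetical: the profit model, the demand and margin distributions, and the fixed cost are invented for illustration and do not come from the text.

```python
import random
import statistics


def simulate_profit(n_runs: int, seed: int = 0) -> list:
    """Rerun a toy profit model, drawing new assumptions for every run."""
    rng = random.Random(seed)  # fixed seed so the runs are repeatable
    profits = []
    for _ in range(n_runs):
        demand = rng.gauss(1000, 150)     # hypothetical units sold
        margin = rng.uniform(4.0, 6.0)    # hypothetical profit per unit
        fixed_cost = 3000                 # hypothetical fixed cost
        profits.append(demand * margin - fixed_cost)
    return profits


profits = simulate_profit(5000)
mean_profit = statistics.mean(profits)
share_profitable = sum(p > 0 for p in profits) / len(profits)

print(f"Mean simulated profit: {mean_profit:.0f}")
print(f"Share of runs with a profit: {share_profitable:.1%}")
```

A decision criterion, such as a minimum acceptable share of profitable runs, could then be applied to compare alternative versions of the model.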

Chapter 18

18.1

Output for the summary statistics for the processing time:

If you assume that the 20 books from the two production plants represent independent samples, you can use either the pooled-variance t test or the separate-variance t test to determine whether there is a significant difference between the two means, provided that the normality assumption is met for both plants.
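The test statistics behind this choice can be sketched with the standard library alone. The code computes the pooled-variance t statistic, the separate-variance t statistic with its approximate degrees of freedom, and the ratio of sample variances used in the F test for equal variances. The two samples are hypothetical stand-ins for the processing times, and p-values are omitted because they require a t or F distribution function.

```python
import math
import statistics


def pooled_variance_t(sample1, sample2):
    """Pooled-variance t statistic and its degrees of freedom."""
    n1, n2 = len(sample1), len(sample2)
    var1, var2 = statistics.variance(sample1), statistics.variance(sample2)
    pooled_var = ((n1 - 1) * var1 + (n2 - 1) * var2) / (n1 + n2 - 2)
    std_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    t_stat = (statistics.mean(sample1) - statistics.mean(sample2)) / std_error
    return t_stat, n1 + n2 - 2


def separate_variance_t(sample1, sample2):
    """Separate-variance t statistic and its approximate degrees of freedom."""
    n1, n2 = len(sample1), len(sample2)
    v1 = statistics.variance(sample1) / n1
    v2 = statistics.variance(sample2) / n2
    t_stat = (statistics.mean(sample1) - statistics.mean(sample2)) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t_stat, df


def variance_ratio_f(sample1, sample2) -> float:
    """Larger sample variance divided by the smaller one."""
    var1, var2 = statistics.variance(sample1), statistics.variance(sample2)
    return max(var1, var2) / min(var1, var2)


# Hypothetical processing times (NOT the data from the problem).
plant_a = [1, 2, 3, 4, 5]
plant_b = [2, 4, 6, 8, 10]

print(pooled_variance_t(plant_a, plant_b))
print(separate_variance_t(plant_a, plant_b))
print(variance_ratio_f(plant_a, plant_b))
```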


Histogram plots reveal that processing time for both plants is somewhat right skewed. A Wilcoxon rank sum test may be appropriate in this situation. However, because both t tests are robust to the departure from normality, one of these tests will be performed.


Since the p-value = 0.287 is larger than 0.05, do not reject H0. There is not sufficient evidence of a difference in the two population variances.


Since the p-value = 0.183 is greater than 0.05, do not reject H0. There is not sufficient evidence of a difference in the population mean processing time in the two plants. A Wilcoxon nonparametric rank sum test also failed to reveal a significant difference in the median processing time between the two plants. All tests performed revealed insufficient evidence of a difference in processing time between the two plants.

18.2

Please note that for this problem:
“Employees” represents “Travel & Tourism Employees (thousands)”
“Nights” represents “Nights Spent at Tourism Establishments (millions)”
“Expenditures” represents “Accommodation Expenditures at Tourism Establishments (thousands of euros)”
“Establishments” represents “Tourism Establishments (thousands)”

Descriptive Summary of Travel and Tourism

                       Employees        Nights         Expenditures             Establishments
Mean                   6820.533333      62.0709685     5410744.926              20.2177
Median                 2988.3           22.356157      1067998.545              5.1515
Mode                   #N/A             #N/A           #N/A                     #N/A
Minimum                269.6            1.322284       30805.56                 0.268
Maximum                41500            324.389046     59606179.21              220.457
Range                  41230.4          323.066762     59575373.65              220.189
Variance               90863108.4561    8619.6867      129583421076066.0000     1903.4944
Standard Deviation     9532.2142        92.8423        11383471.3983            43.6291
Coeff. of Variation    139.76%          149.57%        210.39%                  215.80%
Skewness               2.3608           2.0121         4.0772                   3.8119
Kurtosis               5.6568           2.8184         18.7172                  16.0905
Count                  30               30             30                       30
Standard Error         1740.3363        16.9506        2078328.0225             7.9655

Multiple Regression Analysis: Employees vs Nights, Expenditures, Establishments

                  Coefficients    Standard Error    t Stat     P-value    VIF
Intercept         1125.6632       603.7475           1.8645    0.0736
Nights              56.8394        10.8787           5.2248    0.0000     3.9597
Expenditures         0.0004         0.0001           5.9620    0.0000     2.4254
Establishments      -3.6231        16.7572          -0.2162    0.8305     2.0748

Because the goal of 18.2 is to predict the value of a dependent variable, travel and tourism jobs (Employees), one would use multiple regression. As a first step in the model building process, a review of the VIFs reveals that all three independent variables have VIF values below five.
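The VIF screen used in this first step follows from $VIF_j = 1/(1 - R_j^2)$, where $R_j^2$ is the coefficient of determination from regressing independent variable j on the other independent variables. The sketch below applies the cutoff of 5 to the reported VIFs and backs out the $R_j^2$ each one implies. The function names are illustrative.

```python
def vif(r_squared_j: float) -> float:
    """Variance inflationary factor for a variable with the given R_j^2."""
    if not 0 <= r_squared_j < 1:
        raise ValueError("R_j^2 must be in [0, 1)")
    return 1.0 / (1.0 - r_squared_j)


def implied_r_squared(vif_value: float) -> float:
    """R_j^2 implied by a reported VIF."""
    return 1.0 - 1.0 / vif_value


# VIF values copied from the regression output for problem 18.2.
reported_vifs = {"Nights": 3.9597, "Expenditures": 2.4254, "Establishments": 2.0748}

flagged = [name for name, value in reported_vifs.items() if value > 5]
for name, value in reported_vifs.items():
    print(f"{name}: VIF = {value}, implied R_j^2 = {implied_r_squared(value):.4f}")
print("Variables above the cutoff of 5:", flagged)
```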

Best Subsets Analysis: Employees vs Nights, Expenditures, Establishments

Model     Cp          k+1    R Square    Adj. R Square    Std. Error
X1         43.4125    2      0.8032      0.7962           4303.5715
X2         49.8956    2      0.7848      0.7771           4500.0600
X3        247.5345    2      0.2245      0.1968           8543.1137
X1X2        2.0467    3      0.9262      0.9207           2684.6272
X1X3       37.5458    3      0.8255      0.8126           4126.7352
X2X3       29.2989    3      0.8489      0.8377           3840.3121
X1X2X3      4.0000    4      0.9263      0.9178           2733.3115

A Best Subsets analysis reveals a number of models that had Cp values equal to k + 1 or less. The two-variable model of nights (X1) and expenditures (X2) had the lowest Cp value (2.0467) and an adjusted $r^2$ = 0.9207. Although one other model (X1X2X3) had a similar adjusted $r^2$ (0.9178), it included three variables and had a higher Cp value (4.0000). Based on the principle of parsimony, the two-variable model of Nights and Expenditures appears to be the preferred model in this case.

Multiple Regression Analysis: Employees vs Nights, Expenditures

Regression Statistics
Multiple R            0.9624
R Square              0.9262
Adjusted R Square     0.9207
Standard Error        2684.6272
Observations          30

ANOVA
              df    SS                 MS                 F
Regression     2    2440435118.7175    1220217559.3588    169.3048
Residual      27     194595026.5091       7207223.2040
Total         29    2635030145.2267

                Coefficients    Standard Error    t Stat    P-value
Intercept       1121.7726       592.7305          1.8926    0.0692
Nights            55.2040         7.6796          7.1884    0.0000
Expenditures       0.0004         0.0001          6.7047    0.0000

Predicted Employees = 1,121.7726 + 55.2040 Nights + 0.0004 Expenditures

The F test of the overall model reveals that the model is significant at the 0.05 significance level. Because FSTAT = 169.3048 with p-value = 0.0000, reject H0. Individual t tests on the independent variables are significant at the 0.05 level, which suggests that the two independent variables in the model should be included. The adjusted $r^2$ of this model is 0.9207. The $r^2$ of 0.9262 indicates that 92.62% of the variation in Employees can be explained by the variation in Nights and Expenditures.
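The adjusted $r^2$ and the overall F statistic quoted for this model can be reproduced from the other reported quantities, using $r^2_{adj} = 1 - (1 - r^2)(n - 1)/(n - k - 1)$ and F = MSR/MSE. The following is a small sketch with illustrative names. Inputs are the rounded values from the output, so agreement is only to about four decimal places.

```python
def adjusted_r_squared(r_squared: float, n: int, k: int) -> float:
    """Adjusted r^2 for a model with k independent variables and n observations."""
    return 1 - (1 - r_squared) * (n - 1) / (n - k - 1)


# Problem 18.2: Employees vs Nights, Expenditures (n = 30, k = 2).
adj_18_2 = adjusted_r_squared(0.9262, n=30, k=2)

# Overall F statistic from the reported mean squares.
msr = 1220217559.3588
mse = 7207223.2040
f_stat = msr / mse

print(f"Adjusted r^2 = {adj_18_2:.4f}")
print(f"F = {f_stat:.4f}")
```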


For both of the two variables, residual plots reveal no clear pattern, indicating no evidence for violation of the equal variance or linearity assumptions. The normal probability plot revealed no evidence of significant departure from the normality assumption.

18.3

Descriptive Summary of Best Cities

                       Median Home Price    Median Age    Average Annual Salary ($)    Unemployment Rate    Average Commute Time
Mean                   330824.66            39.027        59820.5                      7.672                25.01
Median                 302596               38.35         52875                        7.6                  24.4
Mode                   #N/A                 37.2          #N/A                         8.1                  22.8
Minimum                92450                30.7          39250                        4.3                  19.4
Maximum                1455741              52.9          625560                       11.9                 34.8
Range                  1363291              22.2          586310                       7.6                  15.4
Variance               36266084291.2772     14.9470       3335481948.3939              2.1115               10.1213
Standard Deviation     190436.5624          3.8661        57753.6315                   1.4531               3.1814
Coeff. of Variation    57.56%               9.91%         96.54%                       18.94%               12.72%
Skewness               2.9281               1.4034        9.6868                       0.2309               0.7925
Kurtosis               13.4183              2.4918        95.7280                      0.3287               0.4608
Count                  100                  100           100                          100                  100
Standard Error         19043.6562           0.3866        5775.3631                    0.1453               0.3181

Multiple Regression Analysis: Median Home Price vs Median Age, Average Annual Salary, Unemployment Rate, Average Commute Time

                             Coefficients    Standard Error    t Stat     P-value    VIF
Intercept                    -77791.7592     198959.2076       -0.3910    0.6967
Median Age                   -10192.7152       4180.6788       -2.4381    0.0166     1.0562
Average Annual Salary ($)        -0.0471          0.2753       -0.1711    0.8645     1.0218
Unemployment Rate             -6961.3447      11388.8321       -0.6112    0.5425     1.1072
Average Commute Time          34491.5294       5113.5102        6.7452    0.0000     1.0699

Because the goal of 18.3 is to predict the value of a dependent variable, median sales price of homes, one would use multiple regression. As a first step in the model building process, a review of the VIFs reveals that all four independent variables have VIF values below 5.

Best Subsets Analysis: Median Home Price vs Median Age, Average Annual Salary, Unemployment Rate, Average Commute Time

Intermediate Calculations
R2T        0.352051
1 - R2T    0.647949
n          100
T          5
n - T      95

Model       Cp         k+1    R Square    Adj. R Square    Std. Error
X1          45.7184    2      0.0334       0.0235          188181.4015
X2          50.0472    2      0.0039      -0.0063          191033.7864
X3          50.4230    2      0.0013      -0.0089          191279.4235
X4           6.2276    2      0.3028       0.2956          159826.0705
X1X2        47.4087    3      0.0355       0.0156          188942.1382
X1X3        46.8380    3      0.0394       0.0196          188560.4962
X1X4         1.3929    3      0.3494       0.3360          155184.6726
X2X3        51.8206    3      0.0054      -0.0151          191866.9556
X2X4         8.2144    3      0.3028       0.2885          160637.4282
X3X4         6.9452    3      0.3115       0.2973          159637.0537
X1X2X3      48.4974    4      0.0417       0.0118          189310.7168
X1X2X4       3.3736    4      0.3495       0.3292          155975.0643
X1X3X4       3.0293    4      0.3519       0.3316          155693.2517
X2X3X4       8.9441    4      0.3115       0.2900          160465.4419
X1X2X3X4     5.0000    5      0.3521       0.3248          156486.4220

A Best Subsets analysis reveals a number of models that had Cp values equal to k + 1 or less. The two-variable model of median age (X1) and average commute time (X4) had the lowest Cp value (1.3929) and an adjusted r2 = 0.3360. The two-variable model of Median Age and Average Commute Time appears to be the preferred model in this case.
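The Cp values in the table can be approximated from the intermediate calculations using $C_p = \frac{(1 - R_k^2)(n - T)}{1 - R_T^2} - [n - 2(k + 1)]$, where k is the number of independent variables in the subset model and T is the number of parameters in the full model. The sketch below applies the formula to two rows of the 18.3 table. The function name is illustrative, and because the $R^2$ inputs are rounded to four decimals the results match the printed Cp values only approximately.

```python
def mallows_cp(r2_k: float, k: int, r2_full: float, n: int, total_params: int) -> float:
    """Cp statistic for a subset model with k independent variables."""
    return (1 - r2_k) * (n - total_params) / (1 - r2_full) - (n - 2 * (k + 1))


# Intermediate calculations reported for problem 18.3.
R2_FULL = 0.352051
N = 100
T = 5

cp_x1x4 = mallows_cp(0.3494, k=2, r2_full=R2_FULL, n=N, total_params=T)
cp_full = mallows_cp(R2_FULL, k=4, r2_full=R2_FULL, n=N, total_params=T)

print(f"Cp for X1X4: {cp_x1x4:.4f} (k + 1 = 3)")
print(f"Cp for the full model: {cp_full:.4f} (k + 1 = 5)")
```

By construction, the full model always has Cp equal to k + 1, which is why that row carries no information for model selection.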

Multiple Regression Analysis: Median Home Price vs Median Age, Average Commute Time

Regression Statistics
Multiple R            0.5911
R Square              0.3494
Adjusted R Square     0.3360
Standard Error        155184.6726
Observations          100

ANOVA
              df    SS                    MS                   F
Regression     2    1254360932349.1600    627180466174.5810    26.0432
Residual      97    2335981412487.2800     24082282602.9617
Total         99    3590342344836.4400

                        Coefficients    Standard Error    t Stat     P-value
Intercept               -96397.8894     194672.5492       -0.4952    0.6216
Median Age              -10653.9446       4041.3288       -2.6362    0.0098
Average Commute Time     33707.0790       4911.1519        6.8634    0.0000

Predicted Median Home Price = –96,397.8894 – 10,653.9446 median age + 33,707.0790 average commute time

The F test of the overall model reveals that the model is significant at the 0.05 significance level. Because FSTAT = 26.0432 with p-value = 0.000, reject H0. Individual t tests on the independent variables are significant at the 0.05 level, which suggests that both of the independent variables should be included. The model including two independent variables, median age and average commute time, represents the most appropriate model. The adjusted $r^2$ of this model is 0.3360. The $r^2$ of 0.3494 indicates that 34.94% of the variation in median home price can be explained by the variation in median age and average commute time.


For both of the two variables, residual plots reveal no clear pattern, indicating no evidence for violation of the equal variance or linearity assumptions. The normal probability plot revealed no evidence of significant departure from the normality assumption.

18.4

Descriptive Summary of MLB Attendance Study

                       Season Attendance    Wins           Mean Attendance per Game    Stadium Capacity
Mean                   2149.786233          81.03333333    26.8081                     43293.23333
Median                 2282.0015            81             28.5245                     42120.5
Mode                   #N/A                 74             #N/A                        #N/A
Minimum                787.902              55             9.973                       31042
Maximum                3861.408             111            47.671                      56000
Range                  3073.506             56             37.698                      24958
Variance               598646.8575          216.9989       91.4334                     30593431.4954
Standard Deviation     773.7227             14.7309        9.5621                      5531.1329
Coeff. of Variation    35.99%               18.18%         35.67%                      12.78%
Skewness               0.0815               0.1825         0.0709                      0.2755
Kurtosis               -0.6432              -0.7822        -0.6627                     0.2322
Count                  30                   30             30                          30
Standard Error         141.2618             2.6895         1.7458                      1009.8421

                       Mean Capacity Percentage    Team Payroll ($millions)    Team Value ($billions)
Mean                   61.43215526                 132.1365634                 2.074
Median                 61.02860178                 135.75455                   1.73
Mode                   #N/A                        #N/A                        #N/A
Minimum                21.14267543                 46.011667                   0.99
Maximum                93.90507667                 222.205                     6
Range                  72.76240124                 176.193333                  5.01
Variance               362.5600                    1921.2932                   1.2997
Standard Deviation     19.0410                     43.8326                     1.1401
Coeff. of Variation    31.00%                      33.17%                      54.97%
Skewness               -0.1768                     0.1613                      1.8739
Kurtosis               -0.8514                     -0.6412                     3.8080
Count                  30                          30                          30
Standard Error         3.4764                      8.0027                      0.2081

Multiple Regression Analysis: Season Attendance vs Wins, Mean Attendance per Game, Stadium Capacity, Mean Capacity Percentage, Team Payroll ($millions), Team Value ($billions)

                            Coefficients    Standard Error    t Stat     P-value    VIF
Intercept                   133.9804        188.9378           0.7091    0.4854
Wins                         -1.2155          0.6121          -1.9858    0.0591       2.0950
Mean Attendance per Game     86.1712          6.9257          12.4423    0.0000     113.0619
Stadium Capacity             -0.0021          0.0042          -0.5019    0.6205      14.1711
Mean Capacity Percentage     -1.2117          2.9529          -0.4103    0.6854      81.4715
Team Payroll                  0.0185          0.1984           0.0934    0.9264       1.9479
Team Value                  -15.2767          8.1984          -1.8634    0.0752       2.2513

Because the goal of 18.4 is to predict the value of a dependent variable, season attendance, one would use multiple regression. As a first step in the model building process, a review of the VIFs reveals that mean attendance per game has the largest value and is well above five.

Multiple Regression Analysis: Season Attendance vs Wins, Stadium Capacity, Mean Capacity Percentage, Team Payroll ($millions), Team Value ($billions)

                            Coefficients    Standard Error    t Stat      P-value    VIF
Intercept                   -2091.5977      165.6186          -12.6290    0.0000
Wins                            0.6654        1.6144            0.4122    0.6839     1.9672
Stadium Capacity                0.0482        0.0035           13.8867    0.0000     1.2811
Mean Capacity Percentage       34.9331        1.4426           24.2149    0.0000     2.6246
Team Payroll                   -0.5543        0.5251           -1.0556    0.3017     1.8430
Team Value                     13.8482       21.3864            0.6475    0.5234     2.0678

A second regression analysis was performed after eliminating the mean attendance per game variable because it had the highest VIF value, which was well above 5. The second regression analysis reveals that all five of the remaining independent variables in this model have VIFs lower than 5.

Best Subsets Analysis: Season Attendance vs Wins, Stadium Capacity, Mean Capacity Percentage, Team Payroll ($millions), Team Value ($billions)

Model         Cp           k+1    R Square    Adj. R Square    Std. Error
X1            1136.7460    2      0.4416      0.4217           588.4075
X2            1478.7172    2      0.2774      0.2516           669.3653
X3             239.7515    2      0.8724      0.8678           281.3025
X4            1399.3913    2      0.3155      0.2910           651.4826
X5            1167.5012    2      0.4268      0.4064           596.1385
X1X2           836.8057    3      0.5866      0.5560           515.5674
X1X3           233.8780    3      0.8762      0.8670           282.1889
X1X4           789.8166    3      0.6092      0.5802           501.2982
X1X5           814.4472    3      0.5973      0.5675           508.8277
X2X3             1.7550    3      0.9876      0.9867            89.1793
X2X4          1100.0511    3      0.4602      0.4202           589.1500
X2X5          1028.3004    3      0.4946      0.4572           570.0366
X3X4           236.4575    3      0.8749      0.8657           283.5968
X3X5           196.8519    3      0.8939      0.8861           261.1460
X4X5          1064.9480    3      0.4770      0.4383           579.8778
X1X2X3           3.1952    4      0.9879      0.9865            89.8849
X1X2X4         617.5854    4      0.6928      0.6574           452.8740
X1X2X5         692.0896    4      0.6571      0.6175           478.5248
X1X3X4         227.4929    4      0.8802      0.8664           282.8507
X1X3X5         194.8909    4      0.8958      0.8838           263.7235
X1X4X5         703.7411    4      0.6515      0.6113           482.4129
X2X3X4           2.6974    4      0.9881      0.9868            88.9925
X2X3X5           3.6388    4      0.9877      0.9863            90.6728
X2X4X5         934.7652    4      0.5405      0.4875           553.8993
X3X4X5         198.7588    4      0.8940      0.8817           266.0646
X1X2X3X4         4.4193    5      0.9883      0.9864            90.2425
X1X2X3X5         5.1142    5      0.9879      0.9860            91.5176
X1X2X4X5       590.3603    5      0.7069      0.6600           451.1676
X1X3X4X5       196.8400    5      0.8959      0.8792           268.9147
X2X3X4X5         4.1699    5      0.9884      0.9865            89.7805
X1X2X3X4X5       6.0000    6      0.9885      0.9861            91.3092

A Best Subsets analysis reveals a number of models that had Cp values equal to k + 1 or less. The two-variable model of stadium capacity (X2) and mean capacity percentage (X3) had the lowest Cp value (1.7550) and an adjusted $r^2$ = 0.9867. Although one other model (X2X3X4) had a nearly identical adjusted $r^2$ (0.9868), it included three variables and had a slightly higher Cp value (2.6974). Based on the principle of parsimony, the two-variable model of Stadium Capacity and Mean Capacity Percentage appears to be the preferred model in this case.

Multiple Regression Analysis: Season Attendance vs Stadium Capacity, Mean Capacity Percentage

Regression Statistics
Multiple R            0.9938
R Square              0.9876
Adjusted R Square     0.9867
Standard Error        89.1793
Observations          30

ANOVA
              df    SS               MS              F
Regression     2    17146029.3009    8573014.6505    1077.9670
Residual      27      214729.5678       7952.9470
Total         29    17360758.8687

                            Coefficients    Standard Error    t Stat      P-value
Intercept                   -2103.3507      133.4018          -15.7670    0.0000
Stadium Capacity                0.0486        0.0031           15.8618    0.0000
Mean Capacity Percentage       35.0141        0.8892           39.3758    0.0000

Predicted Season Attendance = –2103.3507 + 0.0486 stadium capacity + 35.0141 mean capacity percentage

The F test of the overall model reveals that the model is significant at the 0.05 significance level. Because FSTAT = 1077.967 with p-value = 0.000, reject H0. Individual t tests on the independent variables are significant at the 0.05 level, which suggests that the two independent variables in the model should be included. The adjusted $r^2$ of this model is 0.9867. The $r^2$ of 0.9876 indicates that 98.76% of the variation in season attendance can be explained by the variation in stadium capacity and mean capacity percentage.
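A quick way to sanity-check a fitted equation is to evaluate it at the sample means, because a least-squares regression passes through the point of means. The sketch below does this for the season attendance model, using the means from the descriptive summary. The function name is illustrative, and the small gap between the prediction and the mean season attendance comes from the four-decimal rounding of the coefficients.

```python
def predict_season_attendance(stadium_capacity: float, mean_capacity_pct: float) -> float:
    """Evaluate the fitted two-variable equation for season attendance."""
    return -2103.3507 + 0.0486 * stadium_capacity + 35.0141 * mean_capacity_pct


# Sample means copied from the descriptive summary for problem 18.4.
MEAN_STADIUM_CAPACITY = 43293.23333
MEAN_CAPACITY_PCT = 61.43215526
MEAN_SEASON_ATTENDANCE = 2149.786233

at_means = predict_season_attendance(MEAN_STADIUM_CAPACITY, MEAN_CAPACITY_PCT)

print(f"Prediction at the sample means: {at_means:.3f}")
print(f"Mean season attendance:         {MEAN_SEASON_ATTENDANCE:.3f}")
```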


The residual plots reveal no clear pattern in the residuals. However, the normal probability plot suggests that there is potential evidence of deviation from the normality assumption. One should consider models that use appropriate transformations.

18.5

Because the goal of 18.5 is to predict the value of a dependent variable, price of used cars, one would use multiple regression. As a first step in the model building process, a review of the VIFs reveals that all variables have a VIF of less than 5.

A Best Subsets analysis reveals that all but one of the models have Cp values greater than k + 1, where k represents the number of independent variables. The model including all three independent variables has a Cp value equal to k + 1. This model, which also has the highest adjusted r2, represents the preferred model for further analyses.
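The Cp criterion used above can be sketched in a few lines of Python, assuming the usual textbook definition Cp = (1 - r2_k)(n - T) / (1 - r2_T) - [n - 2(k + 1)], where T is the number of parameters (including the intercept) in the full model. The function name and all numeric inputs below are hypothetical and are not taken from the used car data. The sketch shows why the full model always has Cp equal to k + 1.

```python
def mallows_cp(r2_k, r2_full, n, k, t):
    """Cp statistic for a subset model.

    r2_k    : r2 of the subset model with k independent variables
    r2_full : r2 of the full model
    n       : number of observations
    t       : number of parameters in the full model, including the intercept
    """
    return (1 - r2_k) * (n - t) / (1 - r2_full) - (n - 2 * (k + 1))

# Hypothetical values for illustration only (not the problem's data).
n, t, r2_full = 100, 4, 0.36
full = mallows_cp(r2_full, r2_full, n, k=3, t=t)   # full three-variable model
subset = mallows_cp(0.30, r2_full, n, k=2, t=t)    # a two-variable subset

print(full, subset)
```

A subset model is a candidate when its Cp is at or below k + 1. In this made-up case the two-variable subset is well above its cutoff of 3, so only the full model qualifies.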



The F test of the overall model reveals that the model is significant at the 0.05 significance level. Because FSTAT = 29.52 or p-value = 0.000, reject H0. Individual t tests on the independent variables are significant at the 0.05 level, which suggests that all 3 of the independent variables should be included. The adjusted r2 of this model is 0.3428. The r2 of 0.3549 indicates that 35.49% of the variation in used car price can be explained by the variation in age, mileage, and fuel mileage.


The residual plots reveal little to no pattern. However, the plots reveal several outliers. The normal probability plot suggests that there may be evidence of potential deviation from the normality assumption.

18.6

The focus of this problem is whether there is evidence of gender bias in the evaluation of candidates. A multiple regression analysis was chosen to determine whether the rater to candidate gender relationship is significantly predictive of recommended salary relative to other potential predictors. Let
Y = Salary
X1 = Competence Rating
X2 = M to M (1 if Rater = M and Candidate = M, 0 otherwise)
X3 = F to M (1 if Rater = F and Candidate = M, 0 otherwise)
X4 = M to F (1 if Rater = M and Candidate = F, 0 otherwise)
X5 = Public (1 if public, 0 otherwise)
X6 = Biology (1 if Biology, 0 otherwise)
X7 = Chemistry (1 if Chemistry, 0 otherwise)
X8 = Age-Rater
Dummy variables were created for X2 through X7. For each of the factors, rater to candidate gender and department, the number of dummy variables needed is the total number of categories minus one.
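The "categories minus one" rule for dummy variables can be illustrated with a short Python sketch. The helper name `dummy_code` and the sample values are hypothetical. The sketch assumes that the female rater with female candidate pairing ("F to F") is the omitted baseline, as the variable definitions above imply.

```python
def dummy_code(values, baseline):
    """Return one 0/1 column per category, omitting the baseline category."""
    categories = sorted(set(values) - {baseline})
    return {c: [1 if v == c else 0 for v in values] for c in categories}

# Hypothetical rater to candidate pairings, for illustration only.
pairs = ["M to M", "F to M", "M to F", "F to F", "M to M"]
columns = dummy_code(pairs, baseline="F to F")

for name, column in columns.items():
    print(name, column)
```

Four categories produce three dummy columns. A row with zeros in all three columns belongs to the baseline category.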


A Best Subsets analysis reveals that several models have Cp values greater than k + 1, where k represents the number of independent variables. A number of models have Cp values equal to or less than k + 1. The model including competence rating, the male to male rater to candidate relationship variable, the female to male rater to candidate relationship variable, and the male to female rater to candidate relationship variable had the lowest Cp value and the highest adjusted r2. This model represents a reasonable option for further analyses.


The F test of the overall model reveals that the model is significant at the 0.05 significance level. Because FSTAT = 196.68 or p-value = 0.000, reject H0. Individual t tests on the independent variables are significant at the 0.05 level for all but the male to female rater to candidate relationship variable. For this variable, one would not reject H0 because the p-value = 0.069. This variable should be excluded from the model.


The F test of the overall model reveals that the model is significant at the 0.05 significance level. Because FSTAT = 255.88 or p-value = 0.000, reject H0. Individual t tests on the independent variables are significant at the 0.05 level, which suggests that all 3 of the independent variables should be included. The adjusted r2 of this model is 0.8653. The r2 of 0.8687 indicates that 86.87% of the variation in salary recommendation can be explained by the variation in competence rating, male to male rater to candidate relationship, and female to male rater to candidate relationship.

The best model appears to be the following:

Salary = 17.314 + 2.982 (Competence Rating) + 1.690 (M to M) + 2.127 (F to M)

The above model was chosen based on the Best Subsets Approach. However, the same model would have been chosen by running a comprehensive regression analysis on all eight predictor variables and then choosing the coefficients that were significant at the 0.05 significance level. The Best Subsets Approach was chosen to illustrate the many different possible models that could be considered relative to the Cp value and the adjusted r2.


The residual plots suggest possible violation of the equal variance assumption. The conclusion from the regression result may not be reliable based on this possible violation. The normal probability plot suggests that the residuals are normally distributed except for the three outliers in the right tail.

After removing the three outliers, the normal probability plot suggests that the residuals are normally distributed.


The best regression model suggests that, after taking several factors into consideration, the mean salary of a female candidate is estimated to be the same regardless of the gender of the rater. This conclusion is based on the finding that the gender of the rater was not predictive of the recommended salary for female candidates. In contrast, the mean salary of a male candidate is estimated to be $1.690 thousand higher than that of his female counterpart when rated by a male and $2.127 thousand higher than that of his female counterpart when rated by a female.

The mean salary recommendation for female candidates was $27.60 thousand compared to $31.20 thousand for male candidates. Based on a two-sample t test, the difference is significant at the 0.05 significance level. The regression analyses go beyond understanding whether a difference exists in mean recommended salaries. These analyses directly address the question of whether there is gender bias in the evaluations. The above regression model indicates that there appears to be gender bias, with both male and female raters recommending higher salaries for male candidates compared to female candidates.
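The interpretation above can be checked numerically with a small Python sketch of the fitted salary model. It assumes that every coefficient in the final model is positive, which the interpretation implies for the two gender terms. The function name and the competence rating of 4.0 are hypothetical choices for illustration.

```python
def predicted_salary(competence, m_to_m=0, f_to_m=0):
    """Predicted salary (thousands of dollars) from the final three-variable model.

    Assumes positive signs on all coefficients; the dummy arguments are 0 or 1.
    """
    return 17.314 + 2.982 * competence + 1.690 * m_to_m + 2.127 * f_to_m

rating = 4.0  # hypothetical competence rating, for illustration only
female_candidate = predicted_salary(rating)            # either rater gender
male_by_male = predicted_salary(rating, m_to_m=1)
male_by_female = predicted_salary(rating, f_to_m=1)

print(round(male_by_male - female_candidate, 3))
print(round(male_by_female - female_candidate, 3))
```

Whatever rating is used, the gaps between male and female candidates equal the two dummy coefficients because the competence term cancels.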

18.7

The first part of the problem focuses on analyzing the data based on differences in cost among the various types of cuisines. Descriptive statistics followed by a One-Way ANOVA were performed to address this part of the problem. The descriptive statistics associated with cost for each of the different types of restaurants reveal that French restaurants have the highest mean cost and Mexican restaurants have the lowest mean cost. Although there appears to be some skewness for some of the restaurant categories, the normality assumption is difficult to assess given the relatively small sample sizes. The Levene test for difference in variances did not reveal any evidence of a significant violation of the equality of variance assumption across the cost of different types of cuisines.


A one-way ANOVA F test for the equality of the means reveals that there is sufficient evidence of differences in the cost of a meal for the different types of cuisines at the 5% level of significance. Because FSTAT = 14.21 or p-value = 0.000, reject H0.


The Tukey multiple comparisons show the pair-wise differences among the 7 different types of cuisines. At the higher cost category, the French, Japanese, and Italian restaurants are not significantly different from each other at the 5% level of significance. French cuisine is significantly higher in cost compared to American, Chinese, Indian, and Mexican cuisines. The Japanese and Italian cuisines are higher in cost compared to Chinese, Indian, and Mexican cuisines. At the lower cost category, the Chinese, Indian, and Mexican cuisines are not significantly different from each other at the 5% level of significance.

The second part of the problem focuses on developing a regression model to predict the cost of a meal based on the variables included in the dataset. This procedure will require dummy variables for all but one of the cuisine categories. The following variables represent potential predictors in the model:
X1 = food rating
X2 = décor rating
X3 = service rating
X4 = popularity index
X5 = 1 if American and 0 otherwise
X6 = 1 if Chinese and 0 otherwise
X7 = 1 if French and 0 otherwise
X8 = 1 if Indian and 0 otherwise
X9 = 1 if Italian and 0 otherwise
X10 = 1 if Japanese and 0 otherwise

Among all of the variables, service rating had the highest VIF, which was above 5.


After dropping service rating from the model, food rating was the only variable with a VIF value above 5.

After removing the service rating and food rating variables, none of the remaining variables had VIF values above 5.
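The VIF screening steps above rest on the relationship VIF_j = 1 / (1 - r2_j), where r2_j is the coefficient of determination from regressing independent variable j on all the other independent variables. The following Python sketch shows how the cutoff of 5 behaves. The r2_j values are hypothetical and are not taken from the restaurant data.

```python
def vif(r2_j):
    """Variance inflationary factor for variable j.

    r2_j is the r2 from regressing X_j on all other independent variables.
    """
    return 1.0 / (1.0 - r2_j)

# Hypothetical r2_j values, for illustration only.
for r2_j in (0.50, 0.75, 0.90):
    flag = "candidate for removal" if vif(r2_j) > 5 else "retain"
    print(r2_j, round(vif(r2_j), 2), flag)
```

In practice the variable with the largest VIF above 5 is dropped and the VIFs are recomputed, which matches the sequence described above: service rating first, then food rating.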


A Best Subsets analysis reveals that several models have Cp values greater than k + 1, where k represents the number of independent variables. A number of models have Cp values equal to or less than k + 1. The six-variable model including décor, American cuisine, Chinese cuisine, French cuisine, Italian cuisine, and Japanese cuisine had the lowest Cp value and the highest adjusted r2. This model represents a reasonable option for further analyses.


This model included one variable, Chinese cuisine, that was not significant at the 0.05 significance level.

After removing the Chinese cuisine variable, all remaining variables are significant at the 0.05 significance level.


The F test of the overall model reveals that the model is significant at the 0.05 significance level. Because FSTAT = 28.97 or p-value = 0.000, reject H0. Individual t tests on the independent variables are significant at the 0.05 level, which suggests that all 5 of the independent variables should be included. The adjusted r2 of this model is 0.6860. The r2 of 0.7105 indicates that 71.05% of the variation in the cost of a meal can be explained by the variation in décor rating, American, French, Italian, and Japanese cuisines.

The most appropriate model for predicting the cost of a meal is:

Cost = 8.1 + 2.924 (Decor) + 18.65 (American) + 38.55 (French) + 22.96 (Italian) + 28.11 (Japanese)


The various residual plots against the independent variables suggest possible violation of the equality of variance assumption. The normal probability plot reveals possible departure from normality. The five-factor model including décor, American, French, Italian, and Japanese cuisines represents the best model for predicting the cost of a meal. This model is consistent with the ANOVA post-hoc comparison results, which showed significant differences between the higher cost restaurants and the lower cost restaurants.

18.8

Please note that for Problem 18.8: “male” represents “M” or “men” and “female” represents “W” or “women” in “Gender”.

Descriptive Statistics for Bank Churn Study

A descriptive analysis of all variables is provided above. The problem focuses on determining the likelihood that a customer will leave the bank. In this case, 260 out of 1,000 customers had left the bank. The next step is to develop a model that predicts the likelihood that a customer would leave the bank.

Because the dependent variable in this case is categorical (whether a customer has left the bank), logistic regression is the appropriate regression procedure. The above coefficients table was created from an initial analysis of all potential predictor variables. It was necessary to create dummy variables for gender and domicile location. A series of regression analyses was performed by removing variables that did not significantly predict whether a customer left the bank.

The above model predicts the likelihood a customer will leave the bank based on the following four variables: gender, age, whether a customer is active, and whether a customer lives in Germany. All other variables were not significant at the 0.05 significance level.

Holding constant the effects of age, active membership, and whether one lives in Germany, ln(odds) decreases by 0.515 if the customer is male. Holding constant gender, whether a customer is active, and whether a customer lives in Germany, ln(odds) increases by 0.07009 for each additional year of age. Holding constant the effects of gender, age, and whether a customer lives in Germany, ln(odds) decreases by 0.843 if a customer is an active member. Holding constant the effects of gender, age, and active membership, ln(odds) increases by 0.572 if a customer lives in Germany.

The deviance statistic of 144.72 is well below the critical value of χ2. In this case, one would not reject H0. There is insufficient evidence that the model is not a good-fitting model. However, the results should be interpreted with caution given the large number of degrees of freedom.

Using the above four-variable model, one could predict the likelihood that a customer would leave the bank based on specific values for each of the four variables. For example, suppose one wanted to determine the estimated probability of leaving the bank for customers that were female, 50 years of age, a non-active bank member, and a resident of Germany.

For this example, the estimated probability = 0.667854. The model would predict that 66.8% of customers who are women W (female), age 50, a non-active bank member, and reside in Germany would exit the bank. In contrast to the above example, suppose one wanted to determine the estimated probability of leaving the bank for customers that were men M (male), 25 years of age, an active bank member, and not a resident of Germany.


For this example, the estimated probability = 0.0481653. The model would predict that 4.8% of customers who are men M (male), age 25, an active bank member, and not living in Germany would exit the bank.

Results from other statistical tests of the data are consistent with the above logistic regression analysis. For example, Chi-Square tests for association revealed the following: a significantly higher percentage of women W (females) had exited the bank, a significantly higher percentage of non-active bank members exited the bank, and a significantly higher percentage of customers from Germany exited the bank. A two-sample t test revealed that the mean age of customers was significantly higher among individuals that had exited the bank. Examining the data in various ways can be helpful in assessing the interpretation of the logistic regression model above.
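The two worked examples can be cross-checked against the reported coefficients without knowing the intercept, because the intercept cancels when the ln(odds) of the two customer profiles are subtracted. The Python sketch below treats the two reported values (0.667854 and 0.0481653) as probabilities and converts them back to ln(odds). The function and variable names are illustrative assumptions.

```python
import math

# Coefficients reported for the four-variable model (intercept not restated in the text).
b_male, b_age, b_active, b_germany = -0.515, 0.07009, -0.843, 0.572

def logit(p):
    """ln(odds) corresponding to a probability p."""
    return math.log(p / (1 - p))

def probability(ln_odds):
    """Probability corresponding to a given ln(odds)."""
    odds = math.exp(ln_odds)
    return odds / (1 + odds)

# ln(odds) contributions of each profile, excluding the common intercept.
profile_1 = b_age * 50 + b_germany            # female, 50, non-active, Germany
profile_2 = b_male + b_age * 25 + b_active    # male, 25, active, not Germany
diff_from_coefficients = profile_1 - profile_2

# The same difference recovered from the two reported probabilities.
diff_from_probabilities = logit(0.667854) - logit(0.0481653)

print(round(diff_from_coefficients, 4), round(diff_from_probabilities, 4))
```

The two differences agree to within rounding. That agreement is consistent with reading the reported values as estimated probabilities on the 0 to 1 scale rather than as odds.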

18.9

A descriptive analysis of the various variables revealed considerable right skewness for a number of potential predictor variables. This should be taken into consideration when interpreting the results from subsequent regression analyses. The first part of the problem focuses on presenting conclusions in regard to the relationship between the amount stacked and the various potential types of downtime causes. Two approaches to model building were used to identify an appropriate multiple regression model to understand the variation in amount stacked in relation to the various downtime factors.


For the above analysis, the following variables were utilized: Y = Amount Stacked, X1 = Mechanical, X2 = Electrical, X3 = Tonnage Restriction, X4 = Operator, and X5 = No Feed. None of the independent variables has a VIF > 5.0. Hence, there is no concern about collinearity among the independent variables. Because the t test for the significance of each individual independent variable reveals that tonnage restriction has a tSTAT = 0.01 with a p-value = 0.992, do not reject H0. There is not enough evidence that tonnage restriction is significant at the 5% level, so it should be removed.

After removing the tonnage restriction variable, the F test of the overall model reveals that the model is significant at the 0.05 significance level. Because FSTAT = 76.08 or p-value = 0.000, reject H0. Individual t tests on the independent variables are significant at the 0.05 level, which suggests that all 4 of the independent variables should be included. The adjusted r2 of this model is 0.8983. The r2 of 0.9103 indicates that 91.03% of the variation in amount stacked can be explained by the variation in mechanical, electrical, operator, and no feed variables. The same model is identified as the preferred model for predicting amount stacked using the Best Subsets approach.


The Best Subsets approach also revealed the four variable model consisting of the mechanical, electrical, operator, and no feed variables. This model had the lowest Cp value and the highest adjusted r2. Both model building approaches led to the same 4 variable model.


The residual plots revealed no clear pattern, suggesting that there is insufficient evidence for violations in equal variance and linear assumptions. The normal probability plot revealed no evidence for a violation of the normality assumption. The second part of the problem focuses on developing a model to predict amount stacked based on total downtime.


As expected, the daily amount stacked and downtime appear to be negatively related, with a correlation coefficient r = –0.9410 and a p-value of 0.000. There is strong evidence that the daily amount stacked and downtime are negatively related.

The F test of the model reveals that the model is significant at the 0.05 significance level. Because FSTAT = 255.15 or p-value = 0.000, reject H0. For downtime, tSTAT = -15.97 with a p-value of 0.000. In this case, one would reject H0. The r2 of 0.8855 indicates that 88.55% of the variation in amount stacked can be explained by the variation in downtime.
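With a single independent variable, the reported statistics are tied together: r2 is the square of the correlation coefficient, and the F statistic equals the square of the slope's t statistic. The short Python sketch below checks both relationships using the rounded values reported for this model, so small rounding differences are expected.

```python
# Rounded values reported for the simple regression of amount stacked on downtime.
r = -0.9410        # correlation coefficient (negative relationship)
t_stat = -15.97    # t statistic for the downtime slope
f_stat = 255.15    # overall F statistic
r2 = 0.8855        # coefficient of determination

# In simple linear regression: r2 = r squared, and F = t squared.
print(round(r ** 2, 4), r2)
print(round(t_stat ** 2, 2), f_stat)
```

Because F is the square of t here, the F test and the t test for the slope always lead to the same conclusion in a one-variable model.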


To predict daily amount stacked, the model to use is:

$\hat{Y} = 36760 - 28.24X$, where X is the downtime.

18.10

Please note that for Problem 18.10: “male” represents “M” or “man” and “female” represents “W” or “woman” in “Gender”.

The above descriptive analysis represents a starting point for understanding the data associated with customers of Wally’s Discount Stores. One could conduct detailed descriptive analyses to examine the characteristics and demographics of these customers. Although the problem did not provide a specific direction for an analysis, it would be reasonable to assume that the owners may want to understand the factors that are predictive of the amount of its private-label, ShowGo, purchases. The below analysis will identify the preferred model for predicting ShowGo purchases among customers of Wally’s Discount Stores.


A multiple regression analysis including all potential predictor variables revealed that all variables had VIF values well below five. However, several predictor variables were not significant at the 0.05 significance level.


After removing all non-significant predictor variables, a two-variable model was identified that consisted of whether one owned a Wallys Card and age of customer. Although both variables had tSTAT values that were significant, the overall model was very weak in predicting ShowGo purchases. The r2 of 0.0299 indicates that only 2.99% of the variation in ShowGo purchases can be explained by the variation in whether one owns a Wallys Card and age of customer. Although the model had very little predictive value, the findings do suggest that the owners of Wally’s Discount Stores may want to consider collecting other data if they desire to identify predictors of ShowGo purchases. This finding is a reminder that it can be helpful to understand which variables are not predictive as well as which variables are predictive. As an example, scatterplots of the numeric variables healthy eating rating and active lifestyle rating clearly show there is no relationship between these variables and ShowGo purchases.


Other variables would need to be identified to develop an appropriate model for predicting ShowGo purchases among customers of Wally’s Discount Stores.

18.11

Because the problem focuses on predicting the number of domestic and imported hybrid vehicles sold in 2019 and 2020, it is necessary to build an appropriate regression model. The time series plot reveals an overall upward trend with cyclical components in the number of domestic and imported hybrids sold in the United States from 1999 to 2018.

Linear Model:

A linear trend model with coded year as the independent variable reveals an upward linear trend with an r2 of 0.7521, which indicates that 75.21% of the variation in hybrid sales can be explained by the linear trend of the time series. Because tSTAT = 7.39 or p-value = 0.000, reject H0.

Quadratic Model:

Because tSTAT = -3.61 or p-value = 0.002, reject H0. The quadratic term adds significantly to the prediction of hybrid sales. The r2 of 0.8597 indicates that 85.97% of the variation in hybrid sales can be explained by the linear and quadratic trends of the time series.

Exponential Model:

Because tSTAT = 4.10 or p-value = 0.001, reject H0. The exponential model is useful in predicting the sales of hybrids. However, its r2 value of 0.4826 is much lower than the r2 associated with the linear and quadratic models.


Autoregressive Third-Order Model:

Because tSTAT = 0.33 or p-value = 0.745, do not reject H0. The third-order term can be removed from the model.


Autoregressive Second-Order Model:

Because tSTAT = -1.13 or p-value = 0.275, do not reject H0. The second-order term can be removed from the model.


Autoregressive First-Order Model:

Because tSTAT = 10.10 or p-value = 0.000, reject H0. The first-order autoregressive model is appropriate. The adjusted r2 of 0.8487 for this model was higher than that of all other models.

[Residual plots: Linear, Quadratic, Exponential, First-Order Autoregressive]

The residual plots for the linear, quadratic, and exponential models show clear patterns. The First-Order Autoregressive Model residual plot shows no clear pattern, which indicates the model fits adequately. Based on regression results and the residual plots, the First-Order Autoregressive Model represents the best model to predict hybrid sales. The first-order autoregressive model also had the smallest values for the standard error of the estimate and MAD. Utilizing the first-order autoregressive model, the predicted number of domestic and imported hybrid vehicles sold in the U.S. in 2019 and 2020 would be as follows:

$\hat{Y}_{2019} = 49388 + 0.8676(323912) = 370981$
$\hat{Y}_{2020} = 49388 + 0.8676(370981) = 371238$


Chapter 19 (Online)

19.1

(a) Proportion of nonconformances largest on Day 5, smallest on Day 3.

[Control chart: Proportion versus Day]

(b) n = 100, $\bar{p} = 1.48/10 = 0.148$,
$LCL = \bar{p} - 3\sqrt{\frac{\bar{p}(1-\bar{p})}{n}} = 0.148 - 3\sqrt{\frac{0.148(1-0.148)}{100}} = 0.04147$
$UCL = \bar{p} + 3\sqrt{\frac{\bar{p}(1-\bar{p})}{n}} = 0.148 + 3\sqrt{\frac{0.148(1-0.148)}{100}} = 0.25453$

(c) Proportions are within control limits, so there do not appear to be any special causes of variation.
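The p chart limits in this section all follow the same three-standard-error rule, so they can be reproduced with a small helper. The Python sketch below uses a hypothetical function name, `p_chart_limits`, and returns `None` for the lower limit whenever the formula gives a value at or below zero, matching the "LCL does not exist" convention used in these solutions.

```python
import math

def p_chart_limits(p_bar, n):
    """Return (LCL, UCL) for a p chart with average proportion p_bar and subgroup size n.

    LCL is None when the computed lower limit is not positive.
    """
    half_width = 3 * math.sqrt(p_bar * (1 - p_bar) / n)
    lcl = p_bar - half_width
    return (lcl if lcl > 0 else None), p_bar + half_width

# Problem 19.1(b): n = 100, average proportion 0.148.
lcl, ucl = p_chart_limits(0.148, 100)
print(round(lcl, 5), round(ucl, 5))
```

When subgroup sizes vary, the same function can be called with the average subgroup size in place of n, as the later problems in this section do.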

19.2

(a) Proportion of nonconformances largest on Day 4, smallest on Day 3.

[Control chart: Proportion versus Day]

(b) $\bar{n} = 1036/10 = 103.6$, $\bar{p} = 148/1036 = 0.142857$,
$LCL = \bar{p} - 3\sqrt{\frac{\bar{p}(1-\bar{p})}{\bar{n}}} = 0.142857 - 3\sqrt{\frac{0.142857(1-0.142857)}{103.6}} = 0.039719$
$UCL = \bar{p} + 3\sqrt{\frac{\bar{p}(1-\bar{p})}{\bar{n}}} = 0.142857 + 3\sqrt{\frac{0.142857(1-0.142857)}{103.6}} = 0.245995$

(c) Proportions are within control limits, so there do not appear to be any special causes of variation.

19.3

(a) n = 125, $\bar{p} = 146/3875 = 0.0377$,
$LCL = \bar{p} - 3\sqrt{\frac{\bar{p}(1-\bar{p})}{n}} = 0.0377 - 3\sqrt{\frac{0.0377(1-0.0377)}{125}} = -0.0134 < 0$, so the lower control limit does not exist.
$UCL = \bar{p} + 3\sqrt{\frac{\bar{p}(1-\bar{p})}{n}} = 0.0377 + 3\sqrt{\frac{0.0377(1-0.0377)}{125}} = 0.0888$

(b) The proportion of transmissions with errors on Day 23 is substantially out of control. Possible causes of this value should be investigated.

19.4

(a) n = 500, $\bar{p} = 761/16000 = 0.0476$,
$LCL = \bar{p} - 3\sqrt{\frac{\bar{p}(1-\bar{p})}{n}} = 0.0476 - 3\sqrt{\frac{0.0476(1-0.0476)}{500}} = 0.0190 > 0$
$UCL = \bar{p} + 3\sqrt{\frac{\bar{p}(1-\bar{p})}{n}} = 0.0476 + 3\sqrt{\frac{0.0476(1-0.0476)}{500}} = 0.0761$

(b) Since the individual points are distributed around $\bar{p}$ without any pattern and all the points are within the control limits, the process is in a state of statistical control.

19.5

(a) $\bar{n} = 102.5667$, $\bar{p} = 0.308742$, LCL = 0.171895, UCL = 0.44559

[PHStat output: p chart]

(b) Yes, the process gives an out-of-control signal because the proportions fall outside of the control limits on four of the 30 days.

(c) $\bar{n} = 103.1923$, $\bar{p} = 0.297428$, LCL = 0.162428, UCL = 0.432428

19.6

(a) $\bar{n} = 113345/22 = 5152.0455$, $\bar{p} = 1460/113345 = 0.01288$,
$LCL = \bar{p} - 3\sqrt{\frac{\bar{p}(1-\bar{p})}{\bar{n}}} = 0.01288 - 3\sqrt{\frac{0.01288(1-0.01288)}{5152.0455}} = 0.00817$
$UCL = \bar{p} + 3\sqrt{\frac{\bar{p}(1-\bar{p})}{\bar{n}}} = 0.01288 + 3\sqrt{\frac{0.01288(1-0.01288)}{5152.0455}} = 0.01759$

[PHStat output: p chart]

The proportion of unacceptable cans is below the LCL on Day 4. There is evidence of a pattern over time, since the last eight points are all above the mean and most of the earlier points are below the mean. Thus, the special causes that might be contributing to this pattern should be investigated before any change in the system of operation is contemplated.

(b) Once special causes have been eliminated and the process is stable, Deming’s fourteen points should be implemented to improve the system. They might also look at Day 4 to see if they could identify and exploit the special cause that led to such a low proportion of defects on that day.

19.7

(a) $\bar{p} = 0.042$, UCL = 0.085, LCL does not exist. Points 9, 16, 20, 31, and 36 are above the UCL.

(b) First, the reasons for the special cause variation would need to be determined and local corrective action taken. Once special causes have been eliminated and the process is stable, Deming’s fourteen points should be implemented to improve the system.

19.8

(a) $\bar{p} = 0.1091$, LCL = 0.0751, UCL = 0.1431. Points 9, 26, and 30 are above the UCL.

(b) First, the reasons for the special cause variation would need to be determined and local corrective action taken. Once special causes have been eliminated and the process is stable, Deming’s fourteen points should be implemented to improve the system.

19.11

(a) $\bar{c} = 38/10 = 3.8$,
$LCL = \bar{c} - 3\sqrt{\bar{c}} = 3.8 - 3\sqrt{3.8} < 0$, so the LCL does not exist.
$UCL = \bar{c} + 3\sqrt{\bar{c}} = 3.8 + 3\sqrt{3.8} = 9.648077$

[Control chart: Nonconformances versus Time]

(b) There do not appear to be special causes of variation, as there are no points outside the control limits and no discernible pattern.
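The c chart limits used in these problems depend only on the average number of nonconformances. The following is a minimal Python sketch of the calculation. The function name `c_chart_limits` is a hypothetical choice, and the lower limit is returned as `None` when the formula gives a value at or below zero.

```python
import math

def c_chart_limits(c_bar):
    """Return (LCL, UCL) for a c chart with average count c_bar.

    LCL is None when the computed lower limit is not positive.
    """
    half_width = 3 * math.sqrt(c_bar)
    lcl = c_bar - half_width
    return (lcl if lcl > 0 else None), c_bar + half_width

# Average of 38 nonconformances over 10 time periods.
lcl, ucl = c_chart_limits(38 / 10)
print(lcl, round(ucl, 6))
```

A count above the upper limit signals special cause variation. A missing lower limit means that no count, however small, can be flagged as unusually low.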

19.12

(a) $\bar{c} = 115/10 = 11.5$,
$LCL = \bar{c} - 3\sqrt{\bar{c}} = 11.5 - 3\sqrt{11.5} = 1.32651$
$UCL = \bar{c} + 3\sqrt{\bar{c}} = 11.5 + 3\sqrt{11.5} = 21.67349$

[Control chart: Nonconformances versus Time]

(b) Yes, the number of nonconformances per unit for Time Period 1 is above the upper control limit.

19.13

(a)

$\bar{c} = 155/24 = 6.458$

$\text{LCL} = \bar{c} - 3\sqrt{\bar{c}} = 6.458 - 3\sqrt{6.458} < 0$, so the LCL does not exist.

$\text{UCL} = \bar{c} + 3\sqrt{\bar{c}} = 6.458 + 3\sqrt{6.458} = 14.082$

The process appears to be in control since there are no points outside the upper control limit and there is no pattern in the results over time.

[c chart: nonconformances versus day]

(b)

(c)

19.14

(a)

The value of 12 is within the control limits, so it should be identified as a source of common cause variation. Thus, no action should be taken concerning this value.

If the value were 20 instead of 12, $\bar{c}$ would be 6.792 and the UCL would be 14.61. In this situation, a value of 20 would be substantially above the UCL and action should be taken to explain the special cause of variation.

The process needs to be studied and potentially changed using principles of Six Sigma® management and/or Deming’s 14 points for management.

The twelve errors committed by Gina appear to be much higher than all others, and Gina would need to explain her performance.
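The c chart limits used in these solutions can be checked with a minimal Python sketch. The function name `c_chart_limits` is hypothetical (it is not part of PHStat or the textbook), and the sketch assumes only the total count of nonconformances and the number of units are known.

```python
import math

def c_chart_limits(total_nonconformances, num_units):
    """Return (c_bar, lcl, ucl) for a c chart.

    lcl is None when c_bar - 3*sqrt(c_bar) is not positive,
    which the solutions report as "LCL does not exist".
    """
    c_bar = total_nonconformances / num_units
    half_width = 3 * math.sqrt(c_bar)
    lcl = c_bar - half_width
    ucl = c_bar + half_width
    return c_bar, (lcl if lcl > 0 else None), ucl

# 155 nonconformances over 24 days
c_bar, lcl, ucl = c_chart_limits(155, 24)
print(round(c_bar, 3), lcl, round(ucl, 3))

# Same data with one day's count changed from 12 to 20
c_bar2, lcl2, ucl2 = c_chart_limits(155 - 12 + 20, 24)
print(round(c_bar2, 3), lcl2, round(ucl2, 2))
```

With the changed value, the recomputed center line and UCL agree with the 6.792 and 14.61 quoted above.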


Solutions to End-of-Section and Chapter Review Problems xi 19.14 cont.

(b)

(c) (d)

19.15

c = 5.5, UCL = 12.56, LCL does not exist. The number of errors is in a state of statistical control since none of the tellers are outside the UCL. Since Gina is within the control limits, she is operating within the system, and should not be singled out for further scrutiny. The process needs to be studied and potentially changed using principles of Six Sigma® management and/or Deming’s 14 points for management.

(a)

(b) (c)

c = 5.07, UCL = 11.83, LCL does not exist. There are no points above the UCL. However, the first nine points are all below the center line. Data collection would be delayed until the startup period for the unit had passed. For example, the severity of the illness or the age of the patients in the unit from month to month could be contributing factors.


19.16

(a) (b)

$\bar{c} = 3.057$

(c)

There is evidence of a pattern over time, since the first eight points are all below the mean. Thus, the special causes that might be contributing to this pattern should be investigated before any change in the system of operation is contemplated. Even though weeks 15 and 41 experienced seven fire runs each, they are both below the upper control limit. They can, therefore, be explained by chance causes. After having identified the special causes that might have contributed to the first eight points that are below the average, the fire department can use the c-chart to monitor the process in future weeks in real-time and identify any potential special causes of variation that might have arisen and could be attributed to increased arson, severe drought, or holiday-related activities.

(d) (e)

19.17

(a)

(b) (c)

The number of unsafe acts observed during the ninth tour is above the upper control limit. The process is out of control. The special causes that might have contributed to the extraordinary number of unsafe acts during the ninth tour should be investigated and corrected to improve the process.



19.18

(a) $d_2 = 2.059$
(b) $d_3 = 0.88$
(c) $D_3 = 0$
(d) $D_4 = 2.282$
(e) $A_2 = 0.729$

19.19

(a) $d_2 = 1.693$
(b) $d_3 = 0.888$
(c) $D_3 = 0$
(d) $D_4 = 2.575$
(e) $A_2 = 1.023$

19.20

(a)

$\bar{R} = 0.247$, R chart: UCL = 0.636; LCL does not exist

(b)

According to the R-charts, the process appears to be in control with all points lying inside the control limits without any pattern and no evidence of special cause variation.

(c)

$\bar{\bar{X}} = 47.998$, $\bar{X}$ chart: UCL = 48.2507; LCL = 47.7453

(d)

According to the $\bar{X}$-chart, the process appears to be in control with all points lying inside the control limits without any pattern and no evidence of special cause variation.

19.21

(a)

(b) (c)

$\bar{R} = 3.97$. R chart: $\text{LCL} = D_3\bar{R} = 0(3.97) = 0$, so the LCL does not exist. $\text{UCL} = D_4\bar{R} = (2.282)(3.97) = 9.05954$. There are no sample ranges outside the control limits and there does not appear to be a pattern.

$\bar{\bar{X}} = 13.95$. $\bar{X}$ chart: $\text{LCL} = \bar{\bar{X}} - A_2\bar{R} = 13.95 - (0.729)(3.97) = 11.05587$, $\text{UCL} = \bar{\bar{X}} + A_2\bar{R} = 13.95 + (0.729)(3.97) = 16.84413$

(d)

The sample mean on Day 7 is above the UCL, which is evidence of special cause variation.

19.22

(a)

$\bar{R} = \dfrac{\sum_{i=1}^{k} R_i}{k} = 3.275$, $\bar{\bar{X}} = \dfrac{\sum_{i=1}^{k} \bar{X}_i}{k} = 5.9413$.

R chart: $\text{UCL} = D_4\bar{R} = 2.282 \times 3.275 = 7.4736$. LCL does not exist.

$\bar{X}$ chart: $\text{UCL} = \bar{\bar{X}} + A_2\bar{R} = 5.9413 + 0.729 \times 3.275 = 8.3287$, $\text{LCL} = \bar{\bar{X}} - A_2\bar{R} = 5.9413 - 0.729 \times 3.275 = 3.5538$

PHStat R Chart output:

19.22 cont.

(a)

(b)

19.23

(a)

PHStat X Chart output:

The process appears to be in control since there are no points outside the control limits and there is no evidence of a pattern in the range chart, and there are no points outside the control limits and there is no evidence of a pattern in the X chart.

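$\bar{R} = 271.57$. R chart: $\text{UCL} = D_4\bar{R} = (2.114)(271.57) = 574.09$. $\text{LCL} = D_3\bar{R} = 0(271.57) = 0$, so the LCL does not exist.

$\bar{\bar{X}} = 198.67$. $\bar{X}$ chart: $\text{LCL} = \bar{\bar{X}} - A_2\bar{R} = 198.67 - (0.577)(271.57) = 41.97$, $\text{UCL} = \bar{\bar{X}} + A_2\bar{R} = 198.67 + (0.577)(271.57) = 355.36$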
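The R chart and $\bar{X}$ chart limits in 19.21 through 19.23 all follow the same pattern, which the following Python sketch reproduces. The factor table holds only the values quoted in these solutions; the subgroup sizes attached to them (n = 3, 4, 5) and the function name `xbar_r_limits` are assumptions added for illustration.

```python
# Control chart factors as quoted in these solutions.
# Assumption: they correspond to subgroup sizes n = 3, 4, and 5.
FACTORS = {
    3: {"D3": 0.0, "D4": 2.575, "A2": 1.023},
    4: {"D3": 0.0, "D4": 2.282, "A2": 0.729},
    5: {"D3": 0.0, "D4": 2.114, "A2": 0.577},
}

def xbar_r_limits(x_double_bar, r_bar, n):
    """Return {'R': (lcl, ucl), 'Xbar': (lcl, ucl)}; an R-chart lcl of None means it does not exist."""
    f = FACTORS[n]
    r_lcl = f["D3"] * r_bar
    r_ucl = f["D4"] * r_bar
    x_lcl = x_double_bar - f["A2"] * r_bar
    x_ucl = x_double_bar + f["A2"] * r_bar
    return {"R": (r_lcl if r_lcl > 0 else None, r_ucl), "Xbar": (x_lcl, x_ucl)}

# Values from 19.21: mean of subgroup means 13.95, mean range 3.97, factors for n = 4
limits = xbar_r_limits(13.95, 3.97, 4)
print(limits["R"], limits["Xbar"])
```

Small differences in the last decimal place relative to the printed answers are expected, since the printed limits appear to be computed from unrounded means.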

19.23 cont.

(a)

(b)

19.24

(a)

The process appears to be in control since there are no points outside the control limits and no evidence of a pattern in the range chart, and there are no points outside the control limits and no evidence of a pattern in the X chart.

$\bar{R} = 0.8794$, R chart: UCL = 2.0068; LCL does not exist

[R chart with UCL, RBar, and LCL lines]

$\bar{\bar{X}} = 20.1065$, $\bar{X}$ chart: UCL = 20.7476; LCL = 19.4654

[XBar chart with UCL, XBar, and LCL lines]

19.24 cont.

(c)

19.25

(a)


The process appears to be in control since there are no points outside the lower and upper control limits of either the R-chart or the Xbar-chart, and there is no pattern in the results over time.

(b)

(c)

19.26

According to both charts, the process appears to be in control with all points lying inside the control limits and no evidence of any pattern.

(a)

19.26 cont.

(a)

$\bar{R} = 8.145$, $\bar{\bar{X}} = 18.12$. R chart: $\text{LCL} = D_3\bar{R} = 0(8.145) = 0$, so the LCL does not exist. $\text{UCL} = D_4\bar{R} = (2.282)(8.145) = 18.58689$. For the $\bar{X}$ chart: $\text{LCL} = \bar{\bar{X}} - A_2\bar{R} = 18.12 - (0.729)(8.145) = 12.1823$

(b)

19.27

(a)

(b)

$\text{UCL} = \bar{\bar{X}} + A_2\bar{R} = 18.12 + (0.729)(8.145) = 24.0577$

There are no sample ranges outside the control limits and there does not appear to be a pattern in the range chart. The sample mean on Day 15 is above the UCL and the sample mean on Day 16 is below the LCL, which is evidence of special cause variation in the sample means.

Some possible sources of common cause variation are fluctuations in temperature and humidity and geographical disturbances of the environment in which the machines operate. A machine that operates in an earthquake zone could experience chance variation during even undetectable quakes. If a machine operates in a location near a subway line, the vibration from the transit trains can be a systematic, assignable cause of variation.

(c) [R chart with UCL, RBar, and LCL lines]


19.27 cont.

19.28

(c)

(d)

The process appears to be in control since none of the points fall outside the control limits and there is no evidence of any pattern.

(a)

$\bar{R} = 0.3022$, R chart: UCL = 0.6389; LCL does not exist

$\bar{\bar{X}} = 90.1317$, $\bar{X}$ chart: UCL = 90.3060; LCL = 89.9573

[R chart with UCL, RBar, and LCL lines]


19.28 cont.

(a)

[XBar chart with UCL, XBar, and LCL lines]

19.29

19.30


(b)

The R-chart is out-of-control because the 5th and 6th data points fall above the upper control limit. There is also a downward trend in the right tail of the R-chart, which signifies that special causes of variation must be identified and corrected. Even though the X-bar chart also appears to be out-of-control because a majority of the data points fall above or below the control limits, any interpretation will be misleading because the R-chart has indicated the presence of out-of-control conditions. There is also a downward trend in both control charts. Special causes of variation should be investigated and eliminated.

(a)

Estimate of the population mean of all X values $= \bar{\bar{X}} = 20$

(b)

Estimate of the population standard deviation of all X values $= \bar{R}/d_2 = \dfrac{2}{2.059} = 0.9713$

(a)

Estimate of the population mean $= \bar{\bar{X}} = 100$

Estimate of population standard deviation $= \bar{R}/d_2 = \dfrac{3.386}{1.693} = 2$

(b) (c) (d)

$P(98 < X < 102) = P\left(\dfrac{98 - 100}{2} < Z < \dfrac{102 - 100}{2}\right) = 0.6827$

$P(93 < X < 107.5) = P\left(\dfrac{93 - 100}{2} < Z < \dfrac{107.5 - 100}{2}\right) = .9997$

$P(X > 93.8) = P\left(Z > \dfrac{93.8 - 100}{2}\right) = .9990$

$P(X < 110) = P\left(Z < \dfrac{110 - 100}{2}\right) \approx 1$
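The within-specification probabilities above can be checked numerically. The following Python sketch assumes a normally distributed characteristic with standard deviation estimated by $\bar{R}/d_2$; the helper names `normal_cdf` and `proportion_within` are hypothetical and use only the standard library.

```python
import math

def normal_cdf(z):
    """Standard normal cumulative probability P(Z < z)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def proportion_within(mean, sigma, lsl=None, usl=None):
    """Estimated proportion between the limits; omit a limit for a one-sided specification."""
    lower = normal_cdf((lsl - mean) / sigma) if lsl is not None else 0.0
    upper = normal_cdf((usl - mean) / sigma) if usl is not None else 1.0
    return upper - lower

sigma = 3.386 / 1.693  # R-bar / d2 from the estimate above
print(round(proportion_within(100, sigma, 98, 102), 4))
print(round(proportion_within(100, sigma, 93, 107.5), 4))
print(round(proportion_within(100, sigma, lsl=93.8), 4))
print(round(proportion_within(100, sigma, usl=110), 4))
```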


19.31

(a)

$C_p = \dfrac{USL - LSL}{6(\bar{R}/d_2)} = \dfrac{102 - 98}{6(2)} = 0.3333$

$CPL = \dfrac{\bar{\bar{X}} - LSL}{3(\bar{R}/d_2)} = \dfrac{100 - 98}{3(2)} = 0.3333$

$CPU = \dfrac{USL - \bar{\bar{X}}}{3(\bar{R}/d_2)} = \dfrac{102 - 100}{3(2)} = 0.3333$

$C_{pk} = \min(CPL, CPU) = 0.3333$

(b)

$C_p = \dfrac{USL - LSL}{6(\bar{R}/d_2)} = \dfrac{107.5 - 93}{6(2)} = 1.2083$

$CPL = \dfrac{\bar{\bar{X}} - LSL}{3(\bar{R}/d_2)} = \dfrac{100 - 93}{3(2)} = 1.1667$

$CPU = \dfrac{USL - \bar{\bar{X}}}{3(\bar{R}/d_2)} = \dfrac{107.5 - 100}{3(2)} = 1.25$

$C_{pk} = \min(CPL, CPU) = 1.1667$
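The four capability indices share the same inputs, so they can be computed together. This is a minimal Python sketch; the function name `capability_indices` is hypothetical, and `sigma` is assumed to be the estimate $\bar{R}/d_2$ used in these solutions.

```python
def capability_indices(mean, sigma, lsl, usl):
    """Return Cp, CPL, CPU, and Cpk for a two-sided specification."""
    cp = (usl - lsl) / (6 * sigma)
    cpl = (mean - lsl) / (3 * sigma)
    cpu = (usl - mean) / (3 * sigma)
    return {"Cp": cp, "CPL": cpl, "CPU": cpu, "Cpk": min(cpl, cpu)}

# Part (a): specification limits 98 and 102, mean 100, sigma 2
print(capability_indices(100, 2, 98, 102))
# Part (b): specification limits 93 and 107.5
print(capability_indices(100, 2, 93, 107.5))
```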

19.32

(a)

(b)

$P(18 < X < 22) = P\left(\dfrac{18 - 20.1065}{0.8794/2.059} < Z < \dfrac{22 - 20.1065}{0.8794/2.059}\right) = P(-4.932 < Z < 4.4335) = 0.9999$

$C_p = \dfrac{USL - LSL}{6(\bar{R}/d_2)} = \dfrac{22 - 18}{6(0.8794/2.059)} = 1.56$

$CPL = \dfrac{\bar{\bar{X}} - LSL}{3(\bar{R}/d_2)} = \dfrac{20.1065 - 18}{3(0.8794/2.059)} = 1.644$

$CPU = \dfrac{USL - \bar{\bar{X}}}{3(\bar{R}/d_2)} = \dfrac{22 - 20.1065}{3(0.8794/2.059)} = 1.4778$

$C_{pk} = \min(CPL, CPU) = 1.4778$

19.33

(a)

Estimate of the population mean $= \bar{\bar{X}} = 15.85$

Estimate of population standard deviation $= \bar{R}/d_2 = \dfrac{2.272}{1.693} = 1.342$

Percentage within specification $= P(13 < X) = P\left(\dfrac{13 - 15.85}{1.342} < Z\right) = 0.9832$

(b)

$CPL = \dfrac{\bar{\bar{X}} - LSL}{3(\bar{R}/d_2)} = \dfrac{15.85 - 13}{3(2.272/1.693)} = 0.7073$

$C_{pk} = \min(CPL, CPU) = 0.7073$

19.34

(a)

(b)

19.35

(a)

$P(5.2 < X < 5.8) = P\left(\dfrac{5.2 - 5.509}{0.2248/2.059} < Z < \dfrac{5.8 - 5.509}{0.2248/2.059}\right) = P(-2.830 < Z < 2.665) = 0.9938$

According to the estimate in (a), only 99.38% of the tea bags will have a weight between 5.2 grams and 5.8 grams. The process is, therefore, incapable of meeting the 99.7% goal.

The estimated percentage of the waiting times that are inside the specification limits is $P(X < 5) = P\left(Z < \dfrac{5 - 5.9413}{3.275/2.059}\right) = 0.2773$

(b)

The process is not capable of reaching the goal with a 99% requirement and will not be capable with an even more stringent criterion of 99.7%.

19.36

Chance or common causes of variation represent the inherent variability that exists in a system. These consist of the numerous small causes of variability that operate randomly or by chance. Special or assignable causes of variation represent large fluctuations or patterns in the data that are not inherent to a process. These fluctuations are often caused by changes in a system that represent either problems to be fixed or opportunities to exploit.

19.37

Find the reasons for the special causes and take corrective action to prevent their occurrence in the future or exploit them if they improve the process.

19.38

When only common causes of variation are present, it is up to management to change the system.

19.39

The p chart is an attribute control chart. It can be used when sampled items are classified according to whether they conform or do not conform to operationally defined requirements. It is based on the proportion of nonconforming items in a sample.

19.40

Attribute control charts are used for categorical or discrete data such as the number of nonconformances. Variables control charts are used for numerical variables and are based on statistics such as the mean and standard deviation.

19.41

Since the range is used to obtain the control limits of the chart for the mean, the range needs to be in a state of statistical control. Thus, the range and mean charts are used together.

19.42

From the red bead experiment you learned that variation is an inherent part of any process, that workers work within a system over which they have little control, that it is the system that primarily determines their performance, and that only management can change the system.

19.43

Process potential measures the potential of a process in satisfying production specification limits or customer satisfaction but does not take into account the actual performance of the process; process performance refers to the actual performance of the process in satisfying production specification limits.

19.44

If a process has a Cp = 1.5 and a Cpk = 0.8, it indicates that the process has the potential of meeting production specification limits but fails to meet the specification limits in actual performance. The process should be investigated and adjusted to increase either the CPU or CPL or both.

19.45

Capability analysis is not performed on out-of-control processes because out-of-control processes do not allow one to predict their capability. They are considered incapable of meeting specifications and, therefore, incapable of satisfying the production requirement.

19.46

(a)

(b)

(c) (d) 19.47

(a) (b)

(c)

(d)

(e) (f)

One of the main reasons that service quality is lower than product quality is that the former involves human interaction, which is prone to variation. Also, the most critical aspects of a service are often timeliness and professionalism, and customers can always perceive that the service could be done quicker and with greater professionalism. For products, customers often cannot perceive a better or more ideal product than the one they are getting. For example, a new laptop is better and contains more interesting features than any laptop that he or she has ever imagined.

Both services and products are the results of processes. However, measuring services is often harder because of the dynamic variation due to the human interaction between the service provider and the customer. Product quality is often a straightforward measurement of a static physical characteristic like the amount of sugar in a can of soda. Categorical data are also more common in service quality.

Yes.

Yes.

A question like “Do you find the restrooms to be clean?” will provide responses that will allow you to construct a p-chart.

An example of common-cause variation is the inherent fluctuation in the proportion of customers who view the restroom to be clean due to natural fluctuation in the number of visitors during different periods of the day. An example of a special cause variation is a large fluctuation in the proportion that might be caused by malfunctioning of the plumbing system in some restrooms.

If the control chart is in control, you need to determine if the amount of common-cause variation is small enough to satisfy the customers. If the common-cause variation is small enough to satisfy the customers, you can use the control chart to monitor the process on a continuing basis to make sure that the process remains in control. If the common-cause variation is too large, you need to alter the process to reduce the size of the common-cause variation.

If the control chart is out of control, you need to identify the special causes of variation that are producing the out-of-control conditions. If the special cause of variation is detrimental to the quality, you need to implement plans to eliminate this source of variation. If the special cause of variation increases quality, you should change the process so that this special cause is incorporated into the process design.

A question like “Do you intend to return for another visit in the future?” will provide responses that will allow you to construct a p-chart.

Answers to (b) – (d) are the same. Continue to chart daily responses from random sampling.

19.48

(a)

$\bar{R} = 0.1284$, R chart: UCL = 0.33063; LCL does not exist

[R-chart, Boston Shingles, with UCL, RBar, and LCL lines]

(b)

$\bar{\bar{X}} = 1.2133$, $\bar{X}$ chart: UCL = 1.3447; LCL = 1.0820

[Xbar Chart, Boston Shingles, with UCL, XBar, and LCL lines]

(c)

(d)


There is one point above the UCL and one point below the LCL in the control chart for the mean sealant strength. So the process is out-of-control and the process should be investigated for special causes.

(a) [R Chart, Vermont, with UCL, RBar, and LCL lines]


19.48 cont.

(d)

(b) [Xbar Chart, Vermont, with UCL, XBar, and LCL lines]

(c) Since no point falls outside the upper and lower control limits of either chart, the process appears to be in control.

19.49

(a)

(b) (c)

p = 0.75175, LCL = 0.62215, UCL = 0.88135. Although none of the points are outside either the LCL or UCL, there is a clear pattern over time with lower values occurring in the first half of the sequence and higher values occurring toward the end of the sequence. This would explain the pattern in the results over time. The control chart would have been developed using the first 20 days and then, using those limits, the additional proportions could have been plotted.

19.50

(a)

(b) (c)

19.51

(a) (b) (c)

19.52

p = 0.391, LCL = 0.301, UCL = 0.480. The process is out of statistical control. The proportion of investigations that are closed is below the LCL on Days 2 and 16 and is above the UCL on Days 22 and 23. Special causes of variation should be investigated and eliminated. Next, process knowledge should be improved to increase the proportion of investigations closed the same day. When the proportions are above the UCL, this is in the direction the process needs to go. Therefore, when investigating special causes for Days 22 and 23, consideration should be given to exploiting these special causes.

p = 0.1198, LCL = 0.0205, UCL = 0.2191. The process is out of statistical control. The proportion of trades that are undesirable is below the LCL on Day 24 and is above the UCL on Day 4. Special causes of variation should be investigated and eliminated. Next, process knowledge should be improved to decrease the proportion of trades that are undesirable.

Processing time: Control chart for the range:

19.52 cont.

$\bar{R} = 3.597$, LCL = 0.802, UCL = 6.392

Control chart for the mean:

$\bar{\bar{X}} = 2.2653$, LCL = 1.1575, UCL = 3.3732

Processing time can be considered to be in control in terms of the mean and the range since there is no strong pattern in either chart and there are no points that are outside the control limits.

Proportion of rework in the laboratory:

p = 0.04737, LCL = 0.02721, UCL = .06752 Days 6 and 29 are above the UCL. Thus, the special causes that might be contributing to these values should be investigated before any change in the system of operation is contemplated. Then Deming's fourteen points can be applied to improve the system.

19.52 cont.

Number of daily admissions:

$\bar{c} = 26.8$, LCL = 11.2694, UCL = 42.3306

The number of daily admissions can be considered to be in control since there is no strong pattern in the chart and there are no points that are outside the control limits.

19.53

Kidney- Shift 1

19.53 cont.

PHStat output:

Although there are no points outside the control limits, there is a strong increasing trend in nonconformances over time.

Kidney - Shift 2

Although there are no points outside the control limits, there is a strong increasing trend in nonconformances over time.

Shift 1 Shrimp

There are no points outside the control limits and there is no pattern over time.

Shift 2 Shrimp

There are no points outside the control limits and there is no pattern over time. The team needs to determine the reasons for the increase in nonconformances for the kidney product. The production volume for kidney is clearly decreasing for both shifts. This can be observed from a plot of the production volume over time. The team needs to investigate the reasons for this.

19.54

Shift 1 - Kidney

The R-chart shows a process in a state of statistical control, with the individual points distributed around the average range of weight without any pattern and all the points are within the control limits. Thus, any improvement in the process must come from the reduction of common cause variation. Such reductions require changes in the process. These changes are the responsibility of management since improvements in quality cannot occur until changes to the process itself are successfully implemented.

From the Xbar-chart, you can see that the eighth interval is below the LCL. Management needs to determine the root cause for this special cause variation and take corrective action. Also, during the first half of the sequence almost all the 15-minute intervals had less than the mean subgroup average weight, and all the 15-minute intervals in the second half had more than the mean subgroup average weight. This is a degradation in the production capability which is due to some special cause of variation. So the special cause that produced this pattern needs to be determined. When identified, action will be needed to correct this special cause.

19.54 cont.

Shift 2 – Kidney

The R-chart reveals that the 25th interval is above the UCL. Also, during the first half of the sequence almost all the 15-minute intervals had less than the mean subgroup weight range, and all the 15-minute intervals in the second half had more than the mean subgroup weight range. This is a degradation in the production capability which is due to some special cause of variation. So the special cause that produced this pattern needs to be determined. Management needs to determine the root cause for this special cause variation and take corrective action. Since the R-chart indicates an out-of-control process, the interpretation of the chart for the mean will be misleading.

Shift 1 – Shrimp

The R-chart shows a process in a state of statistical control, with the individual points distributed around the average range of weight without any pattern and all the points are within the control limits. Thus, any improvement in the process must come from the reduction of common cause variation. Such reductions require changes in the process. These changes are the responsibility of management since improvements in quality cannot occur until changes to the process itself are successfully implemented.

The Xbar-chart shows a process in a state of statistical control, with the individual points distributed around the average of the subgroup means without any pattern and all the points are within the control limits. Thus, any improvement in the process must come from the reduction of common cause variation. Such reductions require changes in the process. These changes are the responsibility of management since improvements in quality cannot occur until changes to the process itself are successfully implemented.

Shift 2 – Shrimp

The R-chart reveals that the 20th, 23rd and 28th intervals are above the UCL. Also, there is an upward trend in the range. This is a degradation in the production capability which is due to some special cause of variation. So the special cause that produced this pattern needs to be determined. Management needs to determine the root cause for this special cause variation and take corrective action. Since the R-chart indicates an out-of-control process, the interpretation of the chart for the mean will be misleading.


Chapter 20 (Online)

20.1

(a)

Opportunity loss table:

         Optimum   Profit of          Alternative Courses of Action
Event    Action    Optimum Action     A                  B
1        B         100                100 – 50 = 50      100 – 100 = 0
2        A         200                200 – 200 = 0      200 – 125 = 75

(b) [Decision tree]

A:  Event 1: 50
    Event 2: 200
B:  Event 1: 100
    Event 2: 125

20.2

(a) [Decision tree]

A:  Event 1: 50
    Event 2: 300
    Event 3: 500
B:  Event 1: 10
    Event 2: 100
    Event 3: 200

(b)

Opportunity loss table:

         Optimum   Profit of          Alternative Courses of Action
Event    Action    Optimum Action     A                  B
1        A         50                 50 – 50 = 0        50 – 10 = 40
2        A         300                300 – 300 = 0      300 – 100 = 200
3        A         500                500 – 500 = 0      500 – 200 = 300


20.3

(a)-(c) Payoff table:

         Action
Event    A = Build large factory                 B = Build small factory
1        10,000 × 10 – 400,000 = –300,000        10,000 × 10 – 200,000 = –100,000
2        20,000 × 10 – 400,000 = –200,000        20,000 × 10 – 200,000 = 0
3        50,000 × 10 – 400,000 = 100,000         50,000 × 10 – 200,000 = 300,000
4        100,000 × 10 – 400,000 = 600,000        50,000 × 10 – 200,000 = 300,000

(d) [Decision tree]

A:  Event 1: –300,000
    Event 2: –200,000
    Event 3: 100,000
    Event 4: 600,000
B:  Event 1: –100,000
    Event 2: 0
    Event 3: 300,000
    Event 4: 300,000

(e)

Opportunity loss table:

         Optimum   Profit of          Alternative Courses of Action
Event    Action    Optimum Action     A                                       B
1        B         –100,000           –100,000 – (–300,000) = 200,000         –100,000 – (–100,000) = 0
2        B         0                  0 – (–200,000) = 200,000                0 – 0 = 0
3        B         300,000            300,000 – (100,000) = 200,000           300,000 – 300,000 = 0
4        A         600,000            600,000 – 600,000 = 0                   600,000 – 300,000 = 300,000
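An opportunity loss table is obtained row by row from the payoff table: for each event, find the best payoff and subtract every action's payoff from it. The following Python sketch applies that rule to the payoffs of 20.3; the function name `opportunity_loss` and the dictionary layout are assumptions made for illustration.

```python
def opportunity_loss(payoffs):
    """payoffs maps event -> {action: payoff}.

    Returns event -> (optimum action, profit of optimum action, {action: loss}).
    """
    table = {}
    for event, row in payoffs.items():
        best_action = max(row, key=row.get)
        best_profit = row[best_action]
        losses = {action: best_profit - value for action, value in row.items()}
        table[event] = (best_action, best_profit, losses)
    return table

# Payoffs from 20.3 (A = build large factory, B = build small factory)
payoffs_20_3 = {
    1: {"A": -300_000, "B": -100_000},
    2: {"A": -200_000, "B": 0},
    3: {"A": 100_000, "B": 300_000},
    4: {"A": 600_000, "B": 300_000},
}

for event, (action, profit, losses) in opportunity_loss(payoffs_20_3).items():
    print(event, action, profit, losses)
```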


20.4

(a)-(b) Payoff table:

         Action
Event    Company A                              Company B
1        $10,000 + $2 × 1,000 = $12,000         $2,000 + $4 × 1,000 = $6,000
2        $10,000 + $2 × 2,000 = $14,000         $2,000 + $4 × 2,000 = $10,000
3        $10,000 + $2 × 5,000 = $20,000         $2,000 + $4 × 5,000 = $22,000
4        $10,000 + $2 × 10,000 = $30,000        $2,000 + $4 × 10,000 = $42,000
5        $10,000 + $2 × 50,000 = $110,000       $2,000 + $4 × 50,000 = $202,000

(c) [Decision tree]

A:  Event 1: $12,000
    Event 2: $14,000
    Event 3: $20,000
    Event 4: $30,000
    Event 5: $110,000
B:  Event 1: $6,000
    Event 2: $10,000
    Event 3: $22,000
    Event 4: $42,000
    Event 5: $202,000

(d)

Opportunity loss table:

         Optimum   Profit of          Alternative Courses of Action
Event    Action    Optimum Action     A           B
1        A         12,000             0           6,000
2        A         14,000             0           4,000
3        B         22,000             2,000       0
4        B         42,000             12,000      0
5        B         202,000            92,000      0

20.5

(a)-(b) Payoff table:

                  Action
Event             A: Buy 100    B: Buy 200    C: Buy 500    D: Buy 1,000
1: Sell 100       1,000         200           –2,200        –6,200
2: Sell 200       1,000         2,000         –400          –4,400
3: Sell 500       1,000         2,000         5,000         1,000
4: Sell 1,000     1,000         2,000         5,000         10,000

20.5 cont.

(c) [Decision tree]

A:  Event 1: 1,000
    Event 2: 1,000
    Event 3: 1,000
    Event 4: 1,000
B:  Event 1: 200
    Event 2: 2,000
    Event 3: 2,000
    Event 4: 2,000
C:  Event 1: –2,200
    Event 2: –400
    Event 3: 5,000
    Event 4: 5,000
D:  Event 1: –6,200
    Event 2: –4,400
    Event 3: 1,000
    Event 4: 10,000

(d)

Opportunity loss table:

         Optimum   Profit of          Alternative Courses of Action
Event    Action    Optimum Action     A         B         C         D
1        A         1,000              0         800       3,200     7,200
2        B         2,000              1,000     0         2,400     6,400
3        C         5,000              4,000     3,000     0         4,000
4        D         10,000             9,000     8,000     5,000     0



20.6

Excel output:

Probabilities & Payoffs Table:
                 P       A       B
E1               0.5     50      100
E2               0.5     200     125
Max                      200     125
Min                      50      100

Maximax    (a)   200
Maximin    (b)   100

Statistics for:                      A           B
Expected Monetary Value      (c)     125         112.5
Variance                             5625        156.25
Standard Deviation                   75          12.5
Coefficient of Variation             0.6         0.111111
Return to Risk Ratio                 1.666667    9

Opportunity Loss Table:
         Optimum   Optimum   Alternatives
         Action    Profit    A       B
E1       B         100       50      0
E2       A         200       0       75

                                     A           B
Expected Opportunity Loss    (d)     25          37.5
                                     EVPI

(a) The optimal action based on the maximax criterion is Action A.

(b) The optimal action based on the maximin criterion is Action B.

(c) $EMV_A = 50(0.5) + 200(0.5) = 125$, $EMV_B = 100(0.5) + 125(0.5) = 112.50$

(d) $EOL_A = 50(0.5) + 0(0.5) = 25$, $EOL_B = 0(0.5) + 75(0.5) = 37.50$

(e) Perfect information would correctly forecast which event, 1 or 2, will occur. The value of perfect information is the increase in the expected value if you knew which of the events 1 or 2 would occur prior to making a decision between actions. It allows us to select the optimum action given a correct forecast.

EMV with perfect information = 100(0.5) + 200(0.5) = 150

EVPI = EMV with perfect information – $EMV_A$ = 150 – 125 = 25

(f) Based on (c) and (d) above, select action A because it has a higher expected monetary value (c) and a lower expected opportunity loss (d) than action B.

(g) $\sigma_A^2 = (50 - 125)^2(0.5) + (200 - 125)^2(0.5) = 5625$, $\sigma_A = 75$, $CV_A = \dfrac{75}{125} \times 100\% = 60\%$

$\sigma_B^2 = (100 - 112.5)^2(0.5) + (125 - 112.5)^2(0.5) = 156.25$, $\sigma_B = 12.5$, $CV_B = \dfrac{12.5}{112.5} \times 100\% = 11.11\%$
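The expected-value calculations for this problem can be reproduced with a short Python sketch. The function name `decision_statistics` and the list-based payoff layout are assumptions for illustration, not the structure of the Excel or PHStat worksheets; the coefficient of variation is returned as a proportion rather than a percentage.

```python
import math

def decision_statistics(probs, payoffs):
    """probs: event probabilities; payoffs: {action: [payoff for each event]}.

    Returns ({action: {EMV, variance, sd, CV, EOL}}, EVPI).
    """
    n_events = len(probs)
    # Best payoff attainable under each event, used for opportunity losses.
    best = [max(values[i] for values in payoffs.values()) for i in range(n_events)]
    stats = {}
    for action, values in payoffs.items():
        emv = sum(p * v for p, v in zip(probs, values))
        var = sum(p * (v - emv) ** 2 for p, v in zip(probs, values))
        sd = math.sqrt(var)
        eol = sum(p * (b - v) for p, b, v in zip(probs, best, values))
        stats[action] = {"EMV": emv, "variance": var, "sd": sd, "CV": sd / emv, "EOL": eol}
    emv_perfect = sum(p * b for p, b in zip(probs, best))
    evpi = emv_perfect - max(s["EMV"] for s in stats.values())
    return stats, evpi

stats, evpi = decision_statistics([0.5, 0.5], {"A": [50, 200], "B": [100, 125]})
for action, s in stats.items():
    print(action, s)
print("EVPI:", evpi)
```

The EVPI computed this way equals the smallest expected opportunity loss, which is why the two values coincide in the output above.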

20.6 cont.

(h) Return-to-risk ratio for A $= \dfrac{125}{75} = 1.667$

Return-to-risk ratio for B $= \dfrac{112.5}{12.5} = 9.0$

(i) Based on (g) and (h), select action B because it has a lower coefficient of variation and a higher return-to-risk ratio.

(j) The best decision depends on the decision criteria. In this case, expected monetary value leads to a different decision than the return-to-risk ratio.

20.7

PHStat output: Expected Monetary Value

Probabilities & Payoffs Table:
                 P       A       B
E1               0.8     50      10
E2               0.1     300     100
E3               0.1     500     200
Max                      500     200
Min                      50      10

Maximax          500
Maximin          50

Statistics for:                      A           B
Expected Monetary Value              120         38
Variance                             21600       3636
Standard Deviation                   146.9694    60.29925
Coefficient of Variation             1.224745    1.586822
Return to Risk Ratio                 0.816497    0.63019

Opportunity Loss Table:
         Optimum   Optimum   Alternatives
         Action    Profit    A       B
E1       A         50        0       40
E2       A         300       0       200
E3       A         500       0       300

                                     A           B
Expected Opportunity Loss            0           82
                                     EVPI

(a) (b) (c) (d) (e) (f) (g)

The optimal action based on maximax criterion is Action A. The optimal action based on maximin criterion is Action A. EMVA = 50(0.8) + 300(0.1) + 500 (0.1) = 120 EMVB = 10(0.8) + 100(0.1) + 200 (0.1) = 38 EOLA = 0(0.8) + 0(0.1) + 0(0.1) = 0 EOLB = 40(0.8) + 200(0.1) + 300(0.1) = 82 EVPI = 0. The expected value of perfect information is zero because the optimum decision is action A across all three event states. Based on the results of (c) and (d), select action A because it has a higher expected monetary value than action B and an opportunity loss of zero.  A 2 = (50 – 120)2 (0.8) + (300 – 120)2 (0.1) + (500 – 120)2 (0.1) = 21,600  A = 146.97 146.97 CVA 100%  = 122.47% 120

20.7 cont.

(g) σ_B^2 = (10 – 38)^2(0.8) + (100 – 38)^2(0.1) + (200 – 38)^2(0.1) = 3,636
    σ_B = 60.299
    CV_B = (60.299/38) × 100% = 158.68%
(h) Return-to-risk ratio for A = 120/146.97 = 0.816
    Return-to-risk ratio for B = 38/60.299 = 0.630
(i) Based on the results of (g) and (h), select action A because it has a lower coefficient of variation and a higher return-to-risk ratio than action B.
(j) The recommendation to select action A is made consistently across both parts (f) and (i).
(k) No, the recommendation to select action A is independent of the probabilities whenever action A is the preferred choice across all event states.

20.8

(a) Rate of return = ($100/$1,000) × 100% = 10%
(b) CV = ($25/$100) × 100% = 25%
(c) Return-to-risk ratio = $100/$25 = 4.0

20.9

(a) EMV = 50(0.3) + 100(0.3) + 120(0.3) + 200(0.1) = 101
(b) σ^2 = (50 – 101)^2(0.3) + (100 – 101)^2(0.3) + (120 – 101)^2(0.3) + (200 – 101)^2(0.1) = 1,869
    σ = 43.23
(c) CV = 42.80%
(d) Return-to-risk ratio = 2.336
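The 20.9 figures can be cross-checked with a short Python sketch. It recomputes the expected monetary value, variance, standard deviation, coefficient of variation, and return-to-risk ratio from the payoffs and probabilities shown in the solution. The function name `risk_summary` and its dictionary keys are illustrative choices, not part of the text or of PHStat.

```python
import math


def risk_summary(payoffs, probs):
    """Summarize one action: EMV, variance, sigma, CV (%), return-to-risk."""
    emv = sum(x * p for x, p in zip(payoffs, probs))
    variance = sum((x - emv) ** 2 * p for x, p in zip(payoffs, probs))
    sigma = math.sqrt(variance)
    # CV is undefined when EMV is 0; return-to-risk is undefined when sigma is 0.
    cv_pct = sigma / emv * 100 if emv != 0 else float("nan")
    return_to_risk = emv / sigma if sigma != 0 else float("nan")
    return {
        "emv": emv,
        "variance": variance,
        "sigma": sigma,
        "cv_pct": cv_pct,
        "return_to_risk": return_to_risk,
    }


# Payoffs and probabilities as given in the 20.9 solution.
summary = risk_summary([50, 100, 120, 200], [0.3, 0.3, 0.3, 0.1])
for name, value in summary.items():
    print(f"{name}: {value:.3f}")
```

Returning `nan` for the undefined cases mirrors the `#DIV/0!` cells that appear in the PHStat output when a standard deviation or an expected monetary value is zero.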

20.10

Select stock A because it has a higher expected monetary value while it has the same standard deviation as stock B.

20.11

Select stock B because it has the same expected monetary value as stock A but a smaller standard deviation.

20.12

PHStat output: Expected Monetary Value

Probabilities & Payoffs Table:
                    P    Sell Soft Drinks    Sell Ice Cream
Cool weather      0.4                  50                30
Warm weather      0.6                  60                90
max                                    60                90
min                                    50                30
Maximax                                                  90
Maximin                                50

Statistics for:             Sell Soft Drinks    Sell Ice Cream
Expected Monetary Value                   56                66
Variance                                  24               864
Standard Deviation                  4.898979          29.39388
Coefficient of Variation            0.087482          0.445362
Return to Risk Ratio                11.43095          2.245366

Opportunity Loss Table:
                Optimum             Optimum              Alternatives
                Action              Profit    Sell Soft Drinks    Sell Ice Cream
Cool weather    Sell Soft Drinks        50                   0                20
Warm weather    Sell Ice Cream          90                  30                 0

                             Sell Soft Drinks    Sell Ice Cream
Expected Opportunity Loss                  18                 8
EVPI                                                          8

(a) The optimal action based on the maximax criterion is to sell ice cream.
(b) The optimal action based on the maximin criterion is to sell soft drinks.
(c) EMV(Soft drinks) = 50(0.4) + 60(0.6) = 56
    EMV(Ice cream) = 30(0.4) + 90(0.6) = 66
(d) EOL(Soft drinks) = 0(0.4) + 30(0.6) = 18
    EOL(Ice cream) = 20(0.4) + 0(0.6) = 8
(e) EVPI is the maximum amount of money the vendor is willing to pay for the information about which event will occur.
(f) Based on (c) and (d), choose to sell ice cream because you will earn a higher expected monetary value and incur a lower opportunity loss than choosing to sell soft drinks.
(g) CV(Soft drinks) = (4.899/56) × 100% = 8.748%
    CV(Ice cream) = (29.394/66) × 100% = 44.536%
(h) Return-to-risk ratio for soft drinks = 56/4.899 = 11.431
    Return-to-risk ratio for ice cream = 66/29.394 = 2.245
(i) To maximize return and minimize risk, you will choose to sell soft drinks because it has the smaller coefficient of variation and the larger return-to-risk ratio.
(j) Ignoring the variability of the payoff in (f), you will choose to sell ice cream. However, when risk, which is measured by standard deviation, is taken into consideration as in coefficient of variation or return-to-risk ratio, you will choose to sell soft drinks because it has the lower variability per unit of expected return or the higher expected return per unit of variability.
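The opportunity loss and EVPI figures for 20.12 follow mechanically from the payoff table. The Python sketch below rebuilds them under the assumption that the payoffs and probabilities are exactly those in the PHStat output; the helper names `opportunity_loss` and `expected` are hypothetical and only serve the illustration.

```python
# Payoff table from 20.12: action -> {event: payoff}.
payoffs = {
    "Sell Soft Drinks": {"Cool weather": 50, "Warm weather": 60},
    "Sell Ice Cream": {"Cool weather": 30, "Warm weather": 90},
}
probs = {"Cool weather": 0.4, "Warm weather": 0.6}


def opportunity_loss(payoffs, events):
    """Loss = best payoff available for the event minus the action's payoff."""
    best = {e: max(row[e] for row in payoffs.values()) for e in events}
    return {a: {e: best[e] - row[e] for e in events} for a, row in payoffs.items()}


def expected(values, probs):
    """Probability-weighted average of a {event: value} row."""
    return sum(values[e] * probs[e] for e in probs)


loss = opportunity_loss(payoffs, probs)
emv = {a: expected(row, probs) for a, row in payoffs.items()}
eol = {a: expected(row, probs) for a, row in loss.items()}

# EMV with perfect information: take the best payoff for each event.
emv_perfect = sum(probs[e] * max(row[e] for row in payoffs.values()) for e in probs)
evpi = emv_perfect - max(emv.values())

print("EMV:", emv)
print("EOL:", eol)
print("EVPI:", evpi)
```

The EVPI computed this way equals the smallest expected opportunity loss, which is why PHStat prints the EVPI label next to the minimum EOL value.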

20.13

PHStat output: Expected Monetary Value

Probabilities & Payoffs Table:
                 P    Buy 500    Buy 1000    Buy 2000
Sell 500       0.2        500           0       -1000
Sell 1000      0.4        500        1000           0
Sell 2000      0.4        500        1000        2000
Max                       500        1000        2000
Min                       500           0       -1000
Maximax                                          2000
Maximin                   500

Statistics for:             Buy 500    Buy 1000    Buy 2000
Expected Monetary Value         500         800         600
Variance                          0      160000     1440000
Standard Deviation                0         400        1200
Coefficient of Variation          0         0.5           2
Return to Risk Ratio        #DIV/0!           2         0.5

Opportunity Loss Table:
             Optimum     Optimum            Alternatives
             Action      Profit    Buy 500    Buy 1000    Buy 2000
Sell 500     Buy 500        500          0         500        1500
Sell 1000    Buy 1000      1000        500           0        1000
Sell 2000    Buy 2000      2000       1500        1000           0

                             Buy 500    Buy 1000    Buy 2000
Expected Opportunity Loss        800         500         700
EVPI                                         500

(a) See the table above.
(b) The optimal action based on the maximax criterion is to buy 2000 pounds.
(c) The optimal action based on the maximin criterion is to buy 500 pounds.
(d) EMV_A = 500(0.2) + 500(0.4) + 500(0.4) = 500
    EMV_B = 0(0.2) + 1,000(0.4) + 1,000(0.4) = 800
    EMV_C = –1,000(0.2) + 0(0.4) + 2,000(0.4) = 600
    Based on the expected monetary value, the company should purchase 1,000 pounds of clams and will expect to net $800 for the activity.
(e) σ_A^2 = (500 – 500)^2(0.2) + (500 – 500)^2(0.4) + (500 – 500)^2(0.4) = 0
    σ_A = 0
    σ_B^2 = (0 – 800)^2(0.2) + (1,000 – 800)^2(0.4) + (1,000 – 800)^2(0.4) = 160,000
    σ_B = 400
    σ_C^2 = (–1,000 – 600)^2(0.2) + (0 – 600)^2(0.4) + (2,000 – 600)^2(0.4) = 1,440,000
    σ_C = 1,200

20.13 cont.

(f) Opportunity loss table:

             Optimum    Profit of            Alternative Courses of Action
    Event    Action     Optimum Action          A          B          C
    1        A                     500          0        500      1,500
    2        B                   1,000        500          0      1,000
    3        C                   2,000      1,500      1,000          0

    EOL_A = 0(0.2) + 500(0.4) + 1,500(0.4) = 800
    EOL_B = 500(0.2) + 0(0.4) + 1,000(0.4) = 500
    EOL_C = 1,500(0.2) + 1,000(0.4) + 0(0.4) = 700
(g) EMV with perfect information = 500(0.2) + 1,000(0.4) + 2,000(0.4) = 1,300
    EVPI = EMV with perfect information – EMV_B = 1,300 – 800 = 500
    The company should not be willing to pay more than $500 for a perfect forecast.
(h) CV_A = (0/500) × 100% = 0%
    CV_B = (400/800) × 100% = 50%
    CV_C = (1,200/600) × 100% = 200%
(i) Return-to-risk ratio for A = 500/0 = undefined
    Return-to-risk ratio for B = 800/400 = 2.0
    Return-to-risk ratio for C = 600/1,200 = 0.5
(j) Choose to buy 1,000 pounds of clams. Buying 1,000 pounds has the highest expected monetary value ($800), the lowest expected opportunity loss ($500), and the larger of the two return-to-risk ratios that are defined.
(k) There is no discrepancy.
(l) PHStat output:

Probabilities & Payoffs Table:
                 P    Buy 500    Buy 1000    Buy 2000
Sell 500       0.2        750         250        -750
Sell 1000      0.4        750        1500         500
Sell 2000      0.4        750        1500        3000
Max                       750        1500        3000
Min                       750         250        -750
Maximax                                          3000
Maximin                   750

Statistics for:             Buy 500    Buy 1000    Buy 2000
Expected Monetary Value         750        1250        1250
Variance                          0      250000     2250000
Standard Deviation                0         500        1500
Coefficient of Variation          0         0.4         1.2
Return to Risk Ratio        #DIV/0!         2.5    0.833333

Opportunity Loss Table:
             Optimum     Optimum            Alternatives
             Action      Profit    Buy 500    Buy 1000    Buy 2000
Sell 500     Buy 500        750          0         500        1500
Sell 1000    Buy 1000      1500        750           0        1000
Sell 2000    Buy 2000      3000       2250        1500           0

                             Buy 500    Buy 1000    Buy 2000
Expected Opportunity Loss       1200         700         700
EVPI                                         700         700

    (a) See the table above.
    (b) The optimal action based on the maximax criterion is to buy 2000 pounds.
    (c) The optimal action based on the maximin criterion is to buy 500 pounds.
    (d) EMV_A = 750(0.2) + 750(0.4) + 750(0.4) = 750
        EMV_B = 250(0.2) + 1,500(0.4) + 1,500(0.4) = 1,250
        EMV_C = –750(0.2) + 500(0.4) + 3,000(0.4) = 1,250
        Based solely on the expected monetary value, the company should purchase 1,000 or 2,000 pounds of clams and will expect to net $1,250 for the activity.
    (e) σ_A^2 = (750 – 750)^2(0.2) + (750 – 750)^2(0.4) + (750 – 750)^2(0.4) = 0
        σ_A = 0
        σ_B^2 = (250 – 1,250)^2(0.2) + (1,500 – 1,250)^2(0.4) + (1,500 – 1,250)^2(0.4) = 250,000
        σ_B = 500
        σ_C^2 = (–750 – 1,250)^2(0.2) + (500 – 1,250)^2(0.4) + (3,000 – 1,250)^2(0.4) = 2,250,000
        σ_C = 1,500
    (f) EOL_A = 0(0.2) + 750(0.4) + 2,250(0.4) = 1,200
        EOL_B = 500(0.2) + 0(0.4) + 1,500(0.4) = 700
        EOL_C = 1,500(0.2) + 1,000(0.4) + 0(0.4) = 700
    (g) EMV with perfect information = 750(0.2) + 1,500(0.4) + 3,000(0.4) = 1,950
        EVPI = EMV with perfect information – EMV_B (or EMV_C) = 1,950 – 1,250 = 700
        The company should not be willing to pay more than $700 for a perfect forecast.
    (h) CV_A = (0/750) × 100% = 0%
        CV_B = (500/1,250) × 100% = 40%
        CV_C = (1,500/1,250) × 100% = 120%
    (i) Return-to-risk ratio for A = 750/0 = undefined
        Return-to-risk ratio for B = 1,250/500 = 2.5
        Return-to-risk ratio for C = 1,250/1,500 = 0.833
    (j) Buy 1,000 or 2,000 pounds of clams, actions B or C. Buying 1,000 or 2,000 pounds has the highest expected monetary value ($1,250) and the lowest expected opportunity loss ($700). But action B has the higher return-to-risk ratio and is the best choice with respect to the return-to-risk.

20.13 cont.

(m) PHStat output:

Probabilities & Payoffs Table:
                 P    Buy 500    Buy 1000    Buy 2000
Sell 500       0.4        500           0       -1000
Sell 1000      0.4        500        1000           0
Sell 2000      0.2        500        1000        2000
Max                       500        1000        2000
Min                       500           0       -1000
Maximax                                          2000
Maximin                   500

Statistics for:             Buy 500    Buy 1000    Buy 2000
Expected Monetary Value         500         600           0
Variance                          0      240000     1200000
Standard Deviation                0    489.8979    1095.445
Coefficient of Variation          0    0.816497     #DIV/0!
Return to Risk Ratio        #DIV/0!    1.224745           0

Opportunity Loss Table:
             Optimum     Optimum            Alternatives
             Action      Profit    Buy 500    Buy 1000    Buy 2000
Sell 500     Buy 500        500          0         500        1500
Sell 1000    Buy 1000      1000        500           0        1000
Sell 2000    Buy 2000      2000       1500        1000           0

                             Buy 500    Buy 1000    Buy 2000
Expected Opportunity Loss        500         400        1000
EVPI                                         400

    (a) See the table above.
    (b) The optimal action based on the maximax criterion is to buy 2000 pounds.
    (c) The optimal action based on the maximin criterion is to buy 500 pounds.
    (d) EMV_A = 500(0.4) + 500(0.4) + 500(0.2) = 500
        EMV_B = 0(0.4) + 1,000(0.4) + 1,000(0.2) = 600
        EMV_C = –1,000(0.4) + 0(0.4) + 2,000(0.2) = 0
        Based solely on the expected monetary value, the company should purchase 1,000 pounds of clams and will expect to net $600 for the activity.
    (e) σ_A^2 = (500 – 500)^2(0.4) + (500 – 500)^2(0.4) + (500 – 500)^2(0.2) = 0
        σ_A = 0
        σ_B^2 = (0 – 600)^2(0.4) + (1,000 – 600)^2(0.4) + (1,000 – 600)^2(0.2) = 240,000
        σ_B = 489.90
        σ_C^2 = (–1,000 – 0)^2(0.4) + (0 – 0)^2(0.4) + (2,000 – 0)^2(0.2) = 1,200,000
        σ_C = 1,095.45

    (f) Opportunity loss table:

                 Optimum    Profit of            Alternative Courses of Action
        Event    Action     Optimum Action          A          B          C
        1        A                     500          0        500      1,500
        2        B                   1,000        500          0      1,000
        3        C                   2,000      1,500      1,000          0

        EOL_A = 0(0.4) + 500(0.4) + 1,500(0.2) = 500
        EOL_B = 500(0.4) + 0(0.4) + 1,000(0.2) = 400
        EOL_C = 1,500(0.4) + 1,000(0.4) + 0(0.2) = 1,000
    (g) EMV with perfect information = 500(0.4) + 1,000(0.4) + 2,000(0.2) = 1,000
        EVPI = EMV with perfect information – EMV_B = 1,000 – 600 = 400
        The company should not be willing to pay more than $400 for a perfect forecast.
    (h) CV_A = (0/500) × 100% = 0%
        CV_B = (489.90/600) × 100% = 81.65%
        CV_C = (1,095.45/0) × 100% = undefined
    (i) Return-to-risk ratio for A = 500/0 = undefined
        Return-to-risk ratio for B = 600/489.90 = 1.22
        Return-to-risk ratio for C = 0/1,095.45 = 0
    (j) Buy 1,000 pounds of clams, action B. Buying 1,000 pounds has the highest expected monetary value ($600) and the lowest expected opportunity loss ($400). Action B has the higher return-to-risk ratio and is the best choice with respect to the return-to-risk.
    (k) Although the values of EMV, EOL, and σ are affected by $0.50/pound changes in the profit and by shifts in the probability with which events occur, the recommendation for action B remains unaffected.

20.14

PHStat output:

Probabilities & Payoffs Table:
                       P         A         B         C
Economy declines     0.3       500     -2000     -7000
No change            0.5      1000      2000     -1000
Economy expands      0.2      2000      5000     20000
Max                           2000      5000     20000
Min                            500     -2000     -7000
Maximax                                          20000
Maximin                        500

Statistics for:                    A           B           C
Expected Monetary Value         1050        1400        1400
Variance                      272500     6240000    93240000
Standard Deviation          522.0153    2497.999    9656.086
Coefficient of Variation    0.497157    1.784285    6.897204
Return to Risk Ratio        2.011435    0.560449    0.144986

Opportunity Loss Table:
                    Optimum   Optimum          Alternatives
                    Action    Profit        A        B        C
Economy declines    A            500        0     2500     7500
No change           B           2000     1000        0     3000
Economy expands     C          20000    18000    15000        0

                                A        B        C
Expected Opportunity Loss    4100     3750     3750
EVPI                                  3750     3750


(a) The optimal action based on the maximax criterion is to choose investment C.
(b) The optimal action based on the maximin criterion is to choose investment A.
(c) EMV_A = 500(0.3) + 1,000(0.5) + 2,000(0.2) = 1,050
    EMV_B = –2,000(0.3) + 2,000(0.5) + 5,000(0.2) = 1,400
    EMV_C = –7,000(0.3) – 1,000(0.5) + 20,000(0.2) = 1,400
(d) See the table above.
    EOL_A = 0(0.3) + 1,000(0.5) + 18,000(0.2) = 4,100
    EOL_B = 2,500(0.3) + 0(0.5) + 15,000(0.2) = 3,750
    EOL_C = 7,500(0.3) + 3,000(0.5) + 0(0.2) = 3,750
(e) EMV with perfect information = 500(0.3) + 2,000(0.5) + 20,000(0.2) = 5,150
    EVPI = EMV with perfect information – EMV_B (or EMV_C) = 5,150 – 1,400 = 3,750
    The investor should not be willing to pay more than $3,750 for a perfect forecast.
(f) Actions B and C maximize the expected monetary value and have the lowest expected opportunity loss.
(g) σ_A^2 = (500 – 1,050)^2(0.3) + (1,000 – 1,050)^2(0.5) + (2,000 – 1,050)^2(0.2) = 272,500
    σ_A = 522.02
    σ_B^2 = (–2,000 – 1,400)^2(0.3) + (2,000 – 1,400)^2(0.5) + (5,000 – 1,400)^2(0.2) = 6,240,000
    σ_B = 2,498.00
    σ_C^2 = (–7,000 – 1,400)^2(0.3) + (–1,000 – 1,400)^2(0.5) + (20,000 – 1,400)^2(0.2) = 93,240,000
    σ_C = 9,656.09
    CV_A = (522.02/1,050) × 100% = 49.72%
    CV_B = (2,498.00/1,400) × 100% = 178.43%
    CV_C = (9,656.09/1,400) × 100% = 689.72%
(h) Return-to-risk ratio for A = 1,050/522.02 = 2.01
    Return-to-risk ratio for B = 1,400/2,498 = 0.56
    Return-to-risk ratio for C = 1,400/9,656.09 = 0.14
(i)-(j) Action A minimizes the coefficient of variation and maximizes the investor's return-to-risk.

(k)
                              (1)              (2)              (3)                       (4)
    Probabilities             0.1, 0.6, 0.3    0.1, 0.3, 0.6    0.4, 0.4, 0.2             0.6, 0.3, 0.1
    (c) Max EMV               C: 4,700         C: 11,000        A or B: 1,000             A: 800
        σ of max-EMV action   σ_C: 10,169      σ_C: 11,145      σ_A: 548; σ_B: 2,683      σ_A: 458
    (d) Min EOL & (e) EVPI    C: 2,550         C: 1,650         A: 4,000 or B: 4,000      A: 2,100
    (g) Min CV                A: 40.99%        A: 36.64%        A: 54.77%                 A: 57.28%
    (h) Max return-to-risk    A: 2.4398        A: 2.7294        A: 1.8257                 A: 1.7457
    (i) Choice on (g), (h)    Choose A         Choose A         Choose A                  Choose A
    (j) Compare (c) and (i)   Different:       Different:       Different:                Same: A
                              (c) C, (i) A     (c) C, (i) A     (c) A or B, (i) A
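The four-scenario comparison in 20.14(k) is a sensitivity analysis: the payoffs stay fixed while the event probabilities change. A minimal Python sketch of that sweep follows, assuming the payoffs from the 20.14 table and the four probability sets listed in part (k). The function names and result keys are illustrative, not taken from the text.

```python
import math

# Payoffs from the 20.14 table, ordered (economy declines, no change, economy expands).
PAYOFFS = {
    "A": [500, 1000, 2000],
    "B": [-2000, 2000, 5000],
    "C": [-7000, -1000, 20000],
}

# The four probability sets listed in part (k).
SCENARIOS = [
    (0.1, 0.6, 0.3),
    (0.1, 0.3, 0.6),
    (0.4, 0.4, 0.2),
    (0.6, 0.3, 0.1),
]


def emv(payoffs, probs):
    return sum(x * p for x, p in zip(payoffs, probs))


def std_dev(payoffs, probs):
    mean = emv(payoffs, probs)
    return math.sqrt(sum((x - mean) ** 2 * p for x, p in zip(payoffs, probs)))


def analyze(probs):
    emvs = {a: emv(x, probs) for a, x in PAYOFFS.items()}
    best = max(emvs.values())
    # Keep every action tied for the maximum EMV (floating-point tolerant).
    max_emv_actions = [a for a, v in emvs.items() if math.isclose(v, best)]
    # No action here has a constant payoff, so every sigma is nonzero.
    ratios = {a: emvs[a] / std_dev(x, probs) for a, x in PAYOFFS.items()}
    best_ratio_action = max(ratios, key=ratios.get)
    return {
        "max_emv": best,
        "max_emv_actions": max_emv_actions,
        "best_return_to_risk": best_ratio_action,
        "return_to_risk": ratios[best_ratio_action],
    }


results = [analyze(p) for p in SCENARIOS]
for probs, r in zip(SCENARIOS, results):
    print(probs, r)
```

Under these inputs the maximum-EMV action changes from one scenario to the next, while the action with the best return-to-risk ratio stays the same.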

20.15

PHStat output:

Probabilities & Payoffs Table:
                  P    Large factory    Small factory
Sell 10000      0.1          -300000          -100000
Sell 20000      0.4          -200000                0
Sell 50000      0.2           100000           300000
Sell 100000     0.3           600000           300000
Max                           600000           300000
Min                          -300000          -100000
Maximax                       600000
Maximin                                       -100000

Statistics for:             Large factory    Small factory
Expected Monetary Value             90000           140000
Variance                         1.27E+11         2.64E+10
Standard Deviation               356230.3         162480.8
Coefficient of Variation         3.958114         1.160577
Return to Risk Ratio             0.252646          0.86164

Opportunity Loss Table:
               Optimum          Optimum             Alternatives
               Action           Profit     Large factory    Small factory
Sell 10000     Small factory   -100000            200000                0
Sell 20000     Small factory         0            200000                0
Sell 50000     Small factory    300000            200000                0
Sell 100000    Large factory    600000                 0           300000

                             Large factory    Small factory
Expected Opportunity Loss           140000            90000
EVPI                                                  90000

(a) The optimal action based on the maximax criterion is to build a large factory.
(b) The optimal action based on the maximin criterion is to build a small factory.
(c) See the table above.
    EMV_A = –300,000(0.1) – 200,000(0.4) + 100,000(0.2) + 600,000(0.3) = 90,000
    EMV_B = –100,000(0.1) + 0(0.4) + 300,000(0.2) + 300,000(0.3) = 140,000
(d) See the table above.
    EOL_A = 200,000(0.1) + 200,000(0.4) + 200,000(0.2) + 0(0.3) = 140,000
    EOL_B = 0(0.1) + 0(0.4) + 0(0.2) + 300,000(0.3) = 90,000
(e) EMV with perfect information = –100,000(0.1) + 0(0.4) + 300,000(0.2) + 600,000(0.3) = 230,000
    EVPI = EMV with perfect information – EMV_B = 230,000 – 140,000 = 90,000
    The company should not be willing to pay more than $90,000 for a perfect forecast.
(f) The company should build a small factory to maximize expected monetary value ($140,000) and minimize expected opportunity loss ($90,000).
(g) CV_A = (356,230/90,000) × 100% = 395.81%
    CV_B = (162,481/140,000) × 100% = 116.06%
(h) Return-to-risk ratio for A = 90,000/356,230 = 0.2526
    Return-to-risk ratio for B = 140,000/162,481 = 0.8616
(i) To minimize risk and maximize the return-to-risk, the company should decide to build a small plant.
(j) There are no discrepancies.
(k) PHStat output:

Probabilities & Payoffs Table:
                  P    Large factory    Small factory
Sell 10000      0.4          -300000          -100000
Sell 20000      0.2          -200000                0
Sell 50000      0.2           100000           300000
Sell 100000     0.2           600000           300000
Max                           600000           300000
Min                          -300000          -100000
Maximax                       600000
Maximin                                       -100000

Statistics for:             Large factory    Small factory
Expected Monetary Value            -20000            80000
Variance                         1.18E+11         3.36E+10
Standard Deviation               342928.6           183303
Coefficient of Variation         -17.1464         2.291288
Return to Risk Ratio             -0.05832         0.436436

Opportunity Loss Table:
               Optimum          Optimum             Alternatives
               Action           Profit     Large factory    Small factory
Sell 10000     Small factory   -100000            200000                0
Sell 20000     Small factory         0            200000                0
Sell 50000     Small factory    300000            200000                0
Sell 100000    Large factory    600000                 0           300000

                             Large factory    Small factory
Expected Opportunity Loss           160000            60000
EVPI                                                  60000

    (c), (d), (g), (h) See the table above.
    (e) EMV with perfect information = –100,000(0.4) + 0(0.2) + 300,000(0.2) + 600,000(0.2) = 140,000
        EVPI = EMV with perfect information – EMV_B = 140,000 – 80,000 = 60,000
        Under these conditions, the company should not be willing to pay more than $60,000 for a perfect forecast.
    (f) The company should build a small factory to maximize expected monetary value ($80,000) and minimize expected opportunity loss ($60,000).
    (i) To minimize risk and maximize the return-to-risk, the company should decide to build a small plant.
    (j) There are no discrepancies. The company's decision is not affected by the changed probabilities.
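The maximax and maximin answers in 20.15 are identical under both probability sets because neither criterion uses the probabilities at all. The Python sketch below illustrates this with the factory payoffs from the table above; the variable names are illustrative.

```python
# Payoffs from 20.15, ordered (sell 10000, sell 20000, sell 50000, sell 100000).
payoffs = {
    "Large factory": [-300000, -200000, 100000, 600000],
    "Small factory": [-100000, 0, 300000, 300000],
}

# Maximax: pick the action whose best-case payoff is largest (optimistic).
maximax_action = max(payoffs, key=lambda a: max(payoffs[a]))

# Maximin: pick the action whose worst-case payoff is largest (pessimistic).
maximin_action = max(payoffs, key=lambda a: min(payoffs[a]))

print("Maximax:", maximax_action, max(payoffs[maximax_action]))
print("Maximin:", maximin_action, min(payoffs[maximin_action]))
```

Because only the row-wise maximum and minimum payoffs enter the calculation, changing the event probabilities in part (k) cannot alter either choice.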

20.16

PHStat output:

Probabilities & Payoffs Table:
                    P         A         B
Demand 1000      0.45     12000      6000
Demand 2000       0.2     14000     10000
Demand 5000      0.15     20000     22000
Demand 10000      0.1     30000     42000
Demand 50000      0.1    110000    202000
Max                      110000    202000
Min                       14000     10000
Maximax                            202000
Maximin                   14000

Statistics for:                    A           B
Expected Monetary Value        25200       32400
Variance                    8.29E+08    3.32E+09
Standard Deviation          28791.67    57583.33
Coefficient of Variation    1.142526    1.777263
Return to Risk Ratio        0.875253    0.562663

Opportunity Loss Table:
                Optimum   Optimum      Alternatives
                Action    Profit        A        B
Demand 1000     A          12000        0     6000
Demand 2000     A          14000        0     4000
Demand 5000     B          22000     2000        0
Demand 10000    B          42000    12000        0
Demand 50000    B         202000    92000        0

                                 A        B
Expected Opportunity Loss    10700     3500
EVPI                                   3500

(a) The optimal action based on the maximax criterion is to sign with company B.
(b) The optimal action based on the maximin criterion is to sign with company A.
(c) EMV_A = 12,000(0.45) + 14,000(0.2) + 20,000(0.15) + 30,000(0.1) + 110,000(0.1) = 25,200
    EMV_B = 6,000(0.45) + 10,000(0.2) + 22,000(0.15) + 42,000(0.1) + 202,000(0.1) = 32,400
(d) EOL_A = 0(0.45) + 0(0.2) + 2,000(0.15) + 12,000(0.1) + 92,000(0.1) = 10,700
    EOL_B = 6,000(0.45) + 4,000(0.2) + 0(0.15) + 0(0.1) + 0(0.1) = 3,500
(e) EMV with perfect information = 12,000(0.45) + 14,000(0.2) + 22,000(0.15) + 42,000(0.1) + 202,000(0.1) = 35,900
    EVPI = EMV with perfect information – EMV_B = 35,900 – 32,400 = 3,500
    The author should not be willing to pay more than $3,500 for a perfect forecast.
(f) Sign with company B to maximize the expected monetary value ($32,400) and minimize the expected opportunity loss ($3,500).
(g) CV_A = (28,792/25,200) × 100% = 114.25%
    CV_B = (57,583/32,400) × 100% = 177.73%
(h) Return-to-risk ratio for A = 25,200/28,792 = 0.8752
    Return-to-risk ratio for B = 32,400/57,583 = 0.5627
(i) Signing with company A will minimize the author's risk and yield the higher return-to-risk.
(j) Company B has a higher EMV than A, but choosing company B also entails more risk and has a lower return-to-risk ratio than A.
(k) (c)-(j) See the table below.

Probabilities & Payoffs Table:
                    P         A         B
Demand 1000       0.3     12000      6000
Demand 2000       0.2     14000     10000
Demand 5000       0.2     20000     22000
Demand 10000      0.1     30000     42000
Demand 50000      0.2    110000    202000
Max                      110000    202000
Min                       14000     10000
Maximax                            202000
Maximin                   14000

Statistics for:                    A           B
Expected Monetary Value        35400       52800
Variance                    1.42E+09    5.68E+09
Standard Deviation           37672.8     75345.6
Coefficient of Variation    1.064203       1.427
Return to Risk Ratio         0.93967    0.700771

Opportunity Loss Table:
                Optimum   Optimum      Alternatives
                Action    Profit        A        B
Demand 1000     A          12000        0     6000
Demand 2000     A          14000        0     4000
Demand 5000     B          22000     2000        0
Demand 10000    B          42000    12000        0
Demand 50000    B         202000    92000        0

                                 A        B
Expected Opportunity Loss    20000     2600
EVPI                                   2600

The author's decision is not affected by the changed probabilities.

20.17

PHStat output:

Probabilities & Payoffs Table:
                P    Purchase 100    Purchase 200    Purchase 500    Purchase 1000
Sell 100      0.2            1000             200           -2200            -6200
Sell 200      0.5            1000            2000            -400            -4400
Sell 500      0.2            1000            2000            5000             1000
Sell 1000     0.1            1000            2000            5000            10000
Max                          1000            2000            5000            10000
Min                          1000             200           -2200            -6200
Maximax                                                                      10000
Maximin                      1000

Statistics for:             Purchase 100    Purchase 200    Purchase 500    Purchase 1000
Expected Monetary Value             1000            1640             860            -2240
Variance                               0          518400         7808400         22550400
Standard Deviation                     0             720        2794.351         4748.726
Coefficient of Variation               0        0.439024        3.249246         -2.11997
Return to Risk Ratio             #DIV/0!        2.277778        0.307764         -0.47171

Opportunity Loss Table:
             Optimum          Optimum                          Alternatives
             Action           Profit    Purchase 100    Purchase 200    Purchase 500    Purchase 1000
Sell 100     Purchase 100       1000               0             800            3200             7200
Sell 200     Purchase 200       2000            1000               0            2400             6400
Sell 500     Purchase 500       5000            4000            3000               0             4000
Sell 1000    Purchase 1000     10000            9000            8000            5000                0

                             Purchase 100    Purchase 200    Purchase 500    Purchase 1000
Expected Opportunity Loss            2200            1560            2340             5440
EVPI                                                 1560

(a) The optimal action based on the maximax criterion is to purchase 1000 trees.
(b) The optimal action based on the maximin criterion is to purchase 100 trees.
(c) Buy 100: EMV_A = 1,000(0.2) + 1,000(0.5) + 1,000(0.2) + 1,000(0.1) = 1,000
    Buy 200: EMV_B = 200(0.2) + 2,000(0.5) + 2,000(0.2) + 2,000(0.1) = 1,640
    Buy 500: EMV_C = –2,200(0.2) – 400(0.5) + 5,000(0.2) + 5,000(0.1) = 860
    Buy 1,000: EMV_D = –6,200(0.2) – 4,400(0.5) + 1,000(0.2) + 10,000(0.1) = –2,240
(d) EOL_A = 0(0.2) + 1,000(0.5) + 4,000(0.2) + 9,000(0.1) = 2,200
    EOL_B = 800(0.2) + 0(0.5) + 3,000(0.2) + 8,000(0.1) = 1,560
    EOL_C = 3,200(0.2) + 2,400(0.5) + 0(0.2) + 5,000(0.1) = 2,340
    EOL_D = 7,200(0.2) + 6,400(0.5) + 4,000(0.2) + 0(0.1) = 5,440
(e) EMV with perfect information = 1,000(0.2) + 2,000(0.5) + 5,000(0.2) + 10,000(0.1) = 3,200
    EVPI = EMV with perfect information – EMV_B = 3,200 – 1,640 = 1,560
    The garden center management should not be willing to pay more than $1,560 for a perfect forecast.
(f) The garden center management should buy 200 trees, action B, to maximize expected monetary value ($1,640) and minimize expected opportunity loss ($1,560).
(g) CV_A = (0/1,000) × 100% = 0%
    CV_B = (720/1,640) × 100% = 43.90%
    CV_C = (2,794.35/860) × 100% = 324.92%
    CV_D = (4,748.73/(–2,240)) × 100% = –212.00%
(h) Return-to-risk ratio for A = 1,000/0 = undefined
    Return-to-risk ratio for B = 1,640/720 = 2.2778
    Return-to-risk ratio for C = 860/2,794 = 0.3078
    Return-to-risk ratio for D = –2,240/4,749 = –0.4717
(i) To minimize risk and maximize the return-to-risk, management should decide to buy 200 trees, action B.
(j) There are no discrepancies.
(k) (c)-(j) See the table below.

PHStat output:

Probabilities & Payoffs Table:
                P    Purchase 100    Purchase 200    Purchase 500    Purchase 1000
Sell 100      0.4            1000             200           -2200            -6200
Sell 200      0.2            1000            2000            -400            -4400
Sell 500      0.2            1000            2000            5000             1000
Sell 1000     0.2            1000            2000            5000            10000
Max                          1000            2000            5000            10000
Min                          1000             200           -2200            -6200
Maximax                                                                      10000
Maximin                      1000

Statistics for:             Purchase 100    Purchase 200    Purchase 500    Purchase 1000
Expected Monetary Value             1000            1280            1040            -1160
Variance                               0          777600        10886400         38102400
Standard Deviation                     0        881.8163        3299.455         6172.714
Coefficient of Variation               0        0.688919        3.172552         -5.32131
Return to Risk Ratio             #DIV/0!        1.451549        0.315204         -0.18792

Opportunity Loss Table:
             Optimum   Optimum                          Alternatives
             Action    Profit    Purchase 100    Purchase 200    Purchase 500    Purchase 1000
Sell 100     A           1000               0             800            3200             7200
Sell 200     B           2000            1000               0            2400             6400
Sell 500     C           5000            4000            3000               0             4000
Sell 1000    D          10000            9000            8000            5000                0

                             Purchase 100    Purchase 200    Purchase 500    Purchase 1000
Expected Opportunity Loss            2800            2520            2760             4960
EVPI                                                 2520

The garden center management should not be willing to pay more than $2,520 for a perfect forecast. The change in probabilities did not result in a different decision. Under these conditions, the garden center should buy 200 trees.

20.18

(a) P(E1 | F) = P(F | E1)P(E1) / [P(F | E1)P(E1) + P(F | E2)P(E2)]
              = 0.6(0.5) / [0.6(0.5) + 0.4(0.5)] = 0.6
    P(E2 | F) = 1 – P(E1 | F) = 1 – 0.6 = 0.4
(b) EMV_A = (0.6)(50) + (0.4)(200) = 110
    EMV_B = (0.6)(100) + (0.4)(125) = 110
(c) EOL_A = (0.6)(50) + (0.4)(0) = 30
    EOL_B = (0.6)(0) + (0.4)(75) = 30
(d) EMV with perfect information = (0.6)(100) + (0.4)(200) = 140
    EVPI = 140 – 110 = 30
    You should not be willing to pay more than $30 for a perfect forecast.
(e) Both have the same EMV and the same EOL.
(f) σ_A^2 = (0.6)(–60)^2 + (0.4)(90)^2 = 5,400
    σ_A = 73.4847
    σ_B^2 = (0.6)(–10)^2 + (0.4)(15)^2 = 150
    σ_B = 12.2474
    CV_A = (73.4847/110) × 100% = 66.8%
    CV_B = (12.2474/110) × 100% = 11.1%
(g) Return-to-risk ratio for A = 110/73.4847 = 1.497
    Return-to-risk ratio for B = 110/12.2474 = 8.981
(h) Action B has a better return-to-risk ratio.
(i) Both have the same EMV, but action B has a better return-to-risk ratio.
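The probability revision in 20.18(a) is Bayes' theorem applied to two events. The Python sketch below is a minimal illustration that assumes the priors, likelihoods, and payoffs shown in the solution; the function name `revise` is hypothetical.

```python
def revise(priors, likelihoods):
    """Bayes' theorem: posterior_i = prior_i * likelihood_i / sum over all events."""
    joint = [p * l for p, l in zip(priors, likelihoods)]
    total = sum(joint)
    return [j / total for j in joint]


# 20.18: P(E1) = P(E2) = 0.5, P(F | E1) = 0.6, P(F | E2) = 0.4.
posterior = revise([0.5, 0.5], [0.6, 0.4])

# Expected monetary values under the revised probabilities.
emv_a = posterior[0] * 50 + posterior[1] * 200
emv_b = posterior[0] * 100 + posterior[1] * 125

print("P(E1 | F), P(E2 | F):", posterior)
print("EMV_A:", emv_a, "EMV_B:", emv_b)
```

The same function accepts any number of events, so it also covers the three-event revision in 20.19 when given the priors (0.8, 0.1, 0.1) and likelihoods (0.2, 0.4, 0.4).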

20.19

(a) P(E1 | F) = P(F | E1)P(E1) / [P(F | E1)P(E1) + P(F | E2)P(E2) + P(F | E3)P(E3)]
              = 0.2(0.8) / [0.2(0.8) + 0.4(0.1) + 0.4(0.1)] = 0.667 or 2/3
    P(E2 | F) = 0.4(0.1) / [0.2(0.8) + 0.4(0.1) + 0.4(0.1)] = 0.167 or 1/6
    P(E3 | F) = 1 – P(E1 | F) – P(E2 | F) = 1 – 0.667 – 0.167 = 0.167 or 1/6
(b) EMV_A = (0.667)(50) + (0.167)(300) + (0.167)(500) = 166.95
    EMV_B = (0.667)(10) + (0.167)(100) + (0.167)(200) = 56.77
(c) EOL_A = (0.667)(0) + (0.167)(0) + (0.167)(0) = 0
    EOL_B = (0.667)(40) + (0.167)(200) + (0.167)(300) = 110.18
(d) EVPI = 0. You should not be willing to pay any money for a perfect forecast.
(e) Action A has a higher EMV and is better for all events.
(f) σ_A^2 = (0.667)(13677.3025) + (0.167)(17702.3025) + (0.167)(110922.3025) = 30603.0698
    σ_A = 174.937
    σ_B^2 = (0.667)(2187.4329) + (0.167)(1868.8329) + (0.167)(20514.8329) = 5197.090
    σ_B = 72.091
    CV_A = (174.937/166.95) × 100% = 104.78%
    CV_B = (72.091/56.77) × 100% = 126.99%
(g) Return-to-risk ratio for A = 166.95/174.937 = 0.954
    Return-to-risk ratio for B = 56.77/72.091 = 0.787
(h) Action A has a better return-to-risk ratio.
(i) Both support action A.

20.20

(a) P(forecast cool | cool weather) = 0.80
    P(forecast warm | warm weather) = 0.70

    Probability tree (weather, then forecast):
    Cool (0.4)    Forecast cool (0.8)    0.32
                  Forecast warm (0.2)    0.08
    Warm (0.6)    Forecast cool (0.3)    0.18
                  Forecast warm (0.7)    0.42

    Joint probability table:
                Forecast cool    Forecast warm    Totals
    Cool                 0.32             0.08       0.4
    Warm                 0.18             0.42       0.6
    Totals               0.5              0.5

    Revised probabilities:
    P(cool | forecast cool) = 0.32/0.5 = 0.64
    P(warm | forecast cool) = 0.18/0.5 = 0.36

    Probability tree (forecast, then weather):
    Forecast cool (0.5)    Cool (0.64)    0.32
                           Warm (0.36)    0.18
    Forecast warm (0.5)    Cool (0.16)    0.08
                           Warm (0.84)    0.42

(b) EMV(Soft drinks) = 50(0.64) + 60(0.36) = 53.6
    EMV(Ice cream) = 30(0.64) + 90(0.36) = 51.6
    EOL(Soft drinks) = 0(0.64) + 30(0.36) = 10.8
    EOL(Ice cream) = 20(0.64) + 0(0.36) = 12.8
    EMV with perfect information = 50(0.64) + 90(0.36) = 64.4
    EVPI = EMV with perfect information – EMV(Soft drinks) = 64.4 – 53.6 = 10.8
    The vendor should not be willing to pay more than $10.80 for a perfect forecast of the weather.
    The vendor should sell soft drinks to maximize value and minimize loss.
    CV(Soft drinks) = (4.8/53.6) × 100% = 8.96%
    CV(Ice cream) = (28.8/51.6) × 100% = 55.81%
    Return-to-risk ratio for soft drinks = 53.6/4.8 = 11.1667
    Return-to-risk ratio for ice cream = 51.6/28.8 = 1.7917
    Based on these revised probabilities, the vendor's decision changes because of the increased likelihood of cool weather given a forecast for cool. Under these conditions, she should sell soft drinks to maximize the expected monetary value and also to minimize her expected opportunity loss, as well as minimizing risk and maximizing return.

Copyright ©2024 Pearson Education, Inc.


Solutions to End-of-Section and Chapter Review Problems xlv 20.21

(a)

P(rosy | decline) = 0.2 P(rosy | no change) = 0.4 P(rosy | expanding) = 0.7

Forecast Rosy

Gloomy

Totals

Decline

0.06

0.24

0.3

No Change

0.20

0.30

0.5

Expanding

0.14

0.06

0.2

Totals

0.4

0.60

R osy 0.2

0.06

D ecl ine 0.3 G l oomy 0.8 R osy 0.4 N o C hang e 0.5

G l oomy 0.6 R osy 0.7

0.24 0.20

0.30 0.14

E x pandi ng 0.2 G l oomy 0.3

(b)

0.06

Given a gloomy forecast, revised conditional probabilities are: .24 .30 P(decline | gloomy) = = 0.40P(no change | gloomy) = = 0.50 .60 .60 .06 P(expanding | gloomy) = = 0.10 .60 Payoff table, given gloomy forecast: Pr

A

B

C

Decline

0.4

500

– 2,000

– 7,000

No Change

0.5

1,000

2,000

–1,000

Expanding

0.1

2,000

5,000

20,000

2,000

5,000

20,000

Max

Copyright ©2024 Pearson Education, Inc.


Min

500

-2,000

Maximax

-7,000 20,000

Maximin

500 900

700

– 1,300

190,000

5,610,000

58,410,000

435.89

2,368.54

7,642.64

CV

48.43%

338.36%

– 587.90%

Return-to-risk

2.0647

0.2955

– 0.1701

EMV

2

Opportunity loss table:

          Optimum   Profit of           Alternative Courses of Action
Event     Action    Optimum Action      A         B         C
1         A         500                 0         2,500     7,500
2         B         2,000               1,000     0         3,000
3         C         20,000              18,000    15,000    0

The optimal action based on the maximax criterion is to choose investment C.
The optimal action based on the maximin criterion is to choose investment A.
EOL(A) = 0(0.4) + 1,000(0.5) + 18,000(0.1) = 2,300
EOL(B) = 2,500(0.4) + 0(0.5) + 15,000(0.1) = 2,500
EOL(C) = 7,500(0.4) + 3,000(0.5) + 0(0.1) = 4,500
EMV with perfect information = 500(0.4) + 2,000(0.5) + 20,000(0.1) = 3,200
EVPI = 3,200 – 900 = 2,300
The investor should not be willing to pay more than $2,300 for a perfect forecast.

(c) Under the new conditions, action A optimizes the expected monetary value, minimizes the coefficient of variation, and maximizes the investor’s return-to-risk. The probability of decline has increased, which lowered the expected monetary value of actions B and C.

20.22

(a)

P(favorable | 1,000) = 0.01    P(favorable | 2,000) = 0.01
P(favorable | 5,000) = 0.25    P(favorable | 10,000) = 0.60
P(favorable | 50,000) = 0.99
P(favorable and 1,000) = 0.01(0.45) = 0.0045
P(favorable and 2,000) = 0.01(0.20) = 0.0020
P(favorable and 5,000) = 0.25(0.15) = 0.0375
P(favorable and 10,000) = 0.60(0.10) = 0.0600
P(favorable and 50,000) = 0.99(0.10) = 0.0990

Joint probability table:

            Favorable   Unfavorable   Totals
1,000       0.0045      0.4455        0.45
2,000       0.0020      0.1980        0.20
5,000       0.0375      0.1125        0.15
10,000      0.0600      0.0400        0.10
50,000      0.0990      0.0010        0.10
Totals      0.2030      0.7970



Given an unfavorable review, the revised conditional probabilities are:
P(1,000 | unfavorable) = 0.4455/0.7970 = 0.5590
P(2,000 | unfavorable) = 0.1980/0.7970 = 0.2484
P(5,000 | unfavorable) = 0.1125/0.7970 = 0.1412
P(10,000 | unfavorable) = 0.0400/0.7970 = 0.0502
P(50,000 | unfavorable) = 0.0010/0.7970 = 0.0013
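The revision above is Bayes' theorem applied to the joint probability table. A minimal sketch, assuming the prior probabilities (0.45, 0.20, 0.15, 0.10, 0.10) and the P(favorable | event) values quoted in part (a); the variable names are illustrative only:

```python
events = [1_000, 2_000, 5_000, 10_000, 50_000]
prior = [0.45, 0.20, 0.15, 0.10, 0.10]
p_favorable = [0.01, 0.01, 0.25, 0.60, 0.99]  # P(favorable | event)

# Joint probability of each event with an unfavorable review.
joint_unfavorable = [pr * (1 - pf) for pr, pf in zip(prior, p_favorable)]

# Marginal probability of an unfavorable review (the column total).
p_unfavorable = sum(joint_unfavorable)

# Revised (posterior) probabilities: joint divided by the marginal.
revised = {e: j / p_unfavorable for e, j in zip(events, joint_unfavorable)}
```

The revised probabilities sum to 1 by construction, which is a quick check on a hand-built joint table.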

(b) Payoff table, given unfavorable review:

Event             Pr        A                B
1,000             0.5590    12,000           6,000
2,000             0.2484    14,000           10,000
5,000             0.1412    20,000           22,000
10,000            0.0502    30,000           42,000
50,000            0.0013    110,000          202,000
EMV                         14,658.60        11,315.4
σ²                          31,719,333.50    126,877,326.67
σ                           5,631.99         11,263.98
CV                          38.42%           99.55%
Return-to-risk              2.6027           1.0046

Opportunity loss table:

                  Pr        A           B
Event 1           0.5590    0           6,000
Event 2           0.2484    0           4,000
Event 3           0.1412    2,000       0
Event 4           0.0502    12,000      0
Event 5           0.0013    92,000      0
EOL                         1,004.40    4,347.60

(c)

The author’s decision is affected by the changed probabilities. Under the new circumstances, signing with company A maximizes the expected monetary value ($14,658.60), minimizes the expected opportunity loss ($1,004.40), minimizes risk with a smaller coefficient of variation and yields a higher return-to-risk than choosing company B.

20.25

Alternative courses of action represent the choices of the decision-maker. Events are the actual states of the world that can occur.

20.26

A payoff table presents the alternatives in a tabular format, while the probability tree organizes the alternatives and events visually.


20.27

The opportunity loss is the difference between the highest possible profit for an event and the actual profit obtained for an action taken.

20.28

The opportunity loss can never be negative because it is the difference between the highest possible profit for an event and the actual profit obtained for the action taken.

20.29

Expected monetary value represents the mean profit of an alternative course of action. Expected opportunity loss represents the mean opportunity loss of the alternative course of action as compared to the action that would be taken if you knew the event that was going to occur.

20.30

The expected value of perfect information represents the maximum amount you would pay to obtain perfect information. It equals the smallest expected opportunity loss among the alternative courses of action. It is also equal to the expected profit under certainty minus the expected monetary value of the best alternative course of action.

20.31

The expected value of perfect information equals the expected profit under certainty minus the expected monetary value of the best alternative course of action.

20.32

Expected monetary value measures the mean return or profit of an alternative course of action over the long run without regard for the variability in the payoffs under different events. The return-to-risk ratio considers the variability in the payoffs in evaluating which alternative course of action should be chosen.

20.33

Bayes’ theorem uses conditional probabilities to revise the probability of an event in the light of new information.

20.34

A risk averter attempts to reduce risk, while a risk seeker looks for increased return usually associated with greater risk.

20.35

Under many circumstances in the business world, the assumption that each incremental change of profit or loss has the same value as the previous amount of profits attained or losses incurred is not valid. Utilities should be used instead of payoffs under such differential evaluation of incremental profits or losses.

20.36

(a), (c), (g), (h) Payoff table:

Probabilities & Payoffs Table:
                P      Buy 6,000   Buy 8,000   Buy 10,000   Buy 12,000
Sell 6,000      0.1    6,840       6,340       5,840        5,340
Sell 8,000      0.5    6,840       9,120       8,620        8,120
Sell 10,000     0.3    6,840       9,120       11,400       10,900
Sell 12,000     0.1    6,840       9,120       11,400       13,680

Statistics for:              Buy 6,000   Buy 8,000      Buy 10,000     Buy 12,000
Expected Monetary Value      6840        8842           9454           9232
Variance                     0           695556         3168644        4946176
Standard Deviation           0           834            1780.068538    2224
Coefficient of Variation     0           0.094322551    0.188287343    0.240901213
Return to Risk Ratio         undefined   10.60191847    5.311031456    4.151079137
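The payoff table can be regenerated from a simple rule. This is only a sketch under an assumption: the per-loaf figures used here (a gain of $1.14 on each loaf sold and a loss of $0.25 on each unsold loaf) are inferred from the table entries, not quoted from the problem statement, and the function name is hypothetical.

```python
def payoff(bought, demanded, gain_per_sold=1.14, loss_per_unsold=0.25):
    """Profit when `bought` loaves are stocked and `demanded` could be sold.

    The two per-loaf amounts are assumptions inferred from the payoff table.
    """
    sold = min(bought, demanded)
    return gain_per_sold * sold - loss_per_unsold * (bought - sold)

quantities = [6_000, 8_000, 10_000, 12_000]
probs = [0.1, 0.5, 0.3, 0.1]  # P(sell 6,000), ..., P(sell 12,000)

# Expected monetary value of each purchase quantity.
emv = {b: sum(p * payoff(b, d) for p, d in zip(probs, quantities))
       for b in quantities}
best_quantity = max(emv, key=emv.get)
```

Swapping in a different probability list re-runs the same analysis for other demand scenarios.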

(d) Opportunity loss table:

               Optimum       Optimum            Alternatives
               Action        Action Profit      Buy 6,000   Buy 8,000   Buy 10,000   Buy 12,000
Sell 6,000     Buy 6,000     6840               0           500         1000         1500
Sell 8,000     Buy 8,000     9120               2280        0           500          1000
Sell 10,000    Buy 10,000    11400              4560        2280        0            500
Sell 12,000    Buy 12,000    13680              6840        4560        2280         0
Expected Opportunity Loss                       3192        1190        578 (EVPI)   800

Decision tree payoffs (events 1 through 4 for each action):
A (Buy 6,000):     6,840    6,840    6,840     6,840
B (Buy 8,000):     6,340    9,120    9,120     9,120
C (Buy 10,000):    5,840    8,620    11,400    11,400
D (Buy 12,000):    5,340    8,120    10,900    13,680

(e) EVPI = $578. The management of Shop-Quick Supermarkets should not be willing to pay more than $578 for a perfect forecast.
(f) To maximize the expected monetary value and minimize expected opportunity loss, the management should buy 10,000 loaves.
(i) Action B (buying 8,000 loaves) maximizes the return-to-risk and, while buying 6,000 loaves reduces the coefficient of variation to zero, action B has a smaller coefficient of variation than C or D.
(j) The results depend on what your objective is.

(k) (a), (c), (g), (h) Payoff table:

Probabilities & Payoffs Table:
                P      Buy 6,000   Buy 8,000   Buy 10,000   Buy 12,000
Sell 6,000      0.3    6,840       6,340       5,840        5,340
Sell 8,000      0.4    6,840       9,120       8,620        8,120
Sell 10,000     0.2    6,840       9,120       11,400       10,900
Sell 12,000     0.1    6,840       9,120       11,400       13,680

Statistics for:              Buy 6,000   Buy 8,000      Buy 10,000     Buy 12,000
Expected Monetary Value      6840        8286           8620           8398
Variance                     0           1622964        4637040        6878276
Standard Deviation           0           1273.956043    2153.37874     2622.646755
Coefficient of Variation     0           0.153748014    0.249811919    0.312294208
Return to Risk Ratio         undefined   6.504149059    4.003011564    3.202108704

(k) (d) Opportunity loss table:

               Optimum       Optimum            Alternatives
               Action        Action Profit      Buy 6,000   Buy 8,000   Buy 10,000   Buy 12,000
Sell 6,000     Buy 6,000     6840               0           500         1000         1500
Sell 8,000     Buy 8,000     9120               2280        0           500          1000
Sell 10,000    Buy 10,000    11400              4560        2280        0            500
Sell 12,000    Buy 12,000    13680              6840        4560        2280         0
Expected Opportunity Loss                       2508        1062        728 (EVPI)   950

(e) EVPI = $728. The management of Shop-Quick Supermarkets should not be willing to pay more than $728 for a perfect forecast.
(f) To maximize the expected monetary value and minimize expected opportunity loss, the management should buy 10,000 loaves.
(i) Action B (buying 8,000 loaves) maximizes the return-to-risk and, while buying 6,000 loaves reduces the coefficient of variation to zero, action B has a smaller coefficient of variation than C or D.
(j) The results depend on what your objective is.

20.37

(a), (d), (g) Payoff table:

                                A:           B:
Event             Pr            Install      Do Not Install
1    50           0.40          –50,000      0
2    100          0.30          50,000       0
3    200          0.30          250,000      0
EMV                             70,000       0
σ                               124,900      0
CV                              178.43%      undefined
Return-to-risk                  0.5604       undefined

(b) Decision tree payoffs (events 1 through 3 for each action):
A:    –50,000    50,000    250,000
B:    0          0         0

(c), (e) Opportunity loss table:

                                A:           B:
Event             Pr            Install      Do Not Install
1    50           0.40          50,000       0
2    100          0.30          0            50,000
3    200          0.30          0            250,000
EOL                             20,000       90,000

(f) EVPI = $20,000. The owner of the home heating-oil delivery company should not be willing to pay more than $20,000 for a perfect forecast.
(h) To maximize the expected monetary value and minimize the expected opportunity loss, the owner should offer solar heating.
(i) Payoff table:

                                A:           B:
Event             Pr            Install      Do Not Install
50                0.40          –100,000     0
100               0.30          0            0
200               0.30          200,000      0
EMV                             20,000       0
σ                               124,900      0
CV                              624.50%      undefined
Return-to-risk                  0.1601       undefined

Opportunity loss table:

                                A:           B:
Event             Pr            Install      Do Not Install
50                0.40          100,000      0
100               0.30          0            0
200               0.30          0            200,000
EOL                             40,000       60,000

EVPI = $40,000. The owner of the home heating oil delivery company should not be willing to pay more than $40,000 for a perfect forecast. Although individual values are different, the owner’s decision is not affected by the altered start-up costs.

20.38

(a) Decision tree payoffs (events 1 through 3 for each action):
A:    –4,000,000    1,000,000    5,000,000
B:    0             0            0

(c), (f) Payoff table:

Event               Pr      New            Old
1   Weak            0.3     –4,000,000     0
2   Moderate        0.6     1,000,000      0
3   Strong          0.1     5,000,000      0
EMV                         –100,000       0
σ                           2,808,914      0
CV                          –2,808.94%     undefined
Return-to-risk              –0.0356        undefined

(b), (d), (e) Opportunity loss table:

                    Pr      New            Old
Weak                0.3     4,000,000      0
Moderate            0.6     0              1,000,000
Strong              0.1     0              5,000,000
EOL                         1,200,000      1,100,000

EVPI = $1,100,000. The product manager should not be willing to pay more than $1,100,000 for a perfect forecast.
(g) The product manager should continue to use the old packaging to maximize expected monetary value and to minimize expected opportunity loss and risk.

(h) (c), (f) Payoff table:

                    Pr      New            Old
Weak                0.6     –4,000,000     0
Moderate            0.3     1,000,000      0
Strong              0.1     5,000,000      0
EMV                         –1,600,000     0
σ                           3,136,877      0
CV                          –196.05%       undefined
Return-to-risk              –0.5101        undefined

(b), (d), (e) Opportunity loss table:

                    Pr      New            Old
Weak                0.6     4,000,000      0
Moderate            0.3     0              1,000,000
Strong              0.1     0              5,000,000
EOL                         2,400,000      800,000

EVPI = $800,000. The product manager should not be willing to pay more than $800,000 for a perfect forecast.
(g) The product manager should continue to use the old packaging to maximize expected monetary value and to minimize expected opportunity loss and risk.

(i) (c), (f) Payoff table:

                    Pr      New              Old
Weak                0.1     –4,000,000       0
Moderate            0.3     1,000,000        0
Strong              0.6     5,000,000        0
EMV                         2,900,000        0
σ                           2,913,760.457    0
CV                          100.47%          undefined
Return-to-risk              0.9953           undefined

(b), (d), (e) Opportunity loss table:

                    Pr      New            Old
Weak                0.1     4,000,000      0
Moderate            0.3     0              1,000,000
Strong              0.6     0              5,000,000
EOL                         400,000        3,300,000

EVPI = $400,000. The product manager should not be willing to pay more than $400,000 for a perfect forecast.
(g) The product manager should use the new packaging to maximize expected monetary value and to minimize expected opportunity loss.

(j)
P(Sales decreased | weak response) = 0.6
P(Sales stayed same | weak response) = 0.3
P(Sales increased | weak response) = 0.1
P(Sales decreased | moderate response) = 0.2
P(Sales stayed same | moderate response) = 0.4
P(Sales increased | moderate response) = 0.4
P(Sales decreased | strong response) = 0.05
P(Sales stayed same | strong response) = 0.35
P(Sales increased | strong response) = 0.6
P(Sales decreased and weak response) = 0.6(0.3) = 0.18
P(Sales stayed same and weak response) = 0.3(0.3) = 0.09
P(Sales increased and weak response) = 0.1(0.3) = 0.03
P(Sales decreased and moderate response) = 0.2(0.6) = 0.12
P(Sales stayed same and moderate response) = 0.4(0.6) = 0.24
P(Sales increased and moderate response) = 0.4(0.6) = 0.24
P(Sales decreased and strong response) = 0.05(0.1) = 0.005
P(Sales stayed same and strong response) = 0.35(0.1) = 0.035
P(Sales increased and strong response) = 0.6(0.1) = 0.06

Joint probability table:

                    Pr      Sales Decrease   Sales Stay Same   Sales Increase
Weak                0.3     0.180            0.090             0.030
Moderate            0.6     0.120            0.240             0.240
Strong              0.1     0.005            0.035             0.060
Total                       0.305            0.365             0.330

Given the sales stayed the same, the revised conditional probabilities are:
P(weak response | sales stayed same) = 0.09/0.365 = 0.2466
P(moderate response | sales stayed same) = 0.24/0.365 = 0.6575
P(strong response | sales stayed same) = 0.035/0.365 = 0.0959

(k) (c), (f) Payoff table:

                    Pr        New              Old
Weak                0.2466    –4,000,000       0
Moderate            0.6575    1,000,000        0
Strong              0.0959    5,000,000        0
EMV                           150,600          0
σ                             2,641,575.219    0
CV                            1,754.03%        undefined
Return-to-risk                0.0570           undefined

(b), (d), (e) Opportunity loss table:

                    Pr        New            Old
Weak                0.2466    4,000,000      0
Moderate            0.6575    0              1,000,000
Strong              0.0959    0              5,000,000
EOL                           986,400        1,137,000

EVPI = $986,400. The product manager should not be willing to pay more than $986,400 for a perfect forecast.
(g) The product manager should use the new packaging to maximize expected monetary value and to minimize expected opportunity loss.

(l) Given the sales decreased, the revised conditional probabilities are:
P(weak response | sales decreased) = 0.18/0.305 = 0.5902
P(moderate response | sales decreased) = 0.12/0.305 = 0.3934
P(strong response | sales decreased) = 0.005/0.305 = 0.0164

(m) (c), (f) Payoff table:

                    Pr        New              Old
Weak                0.5902    –4,000,000       0
Moderate            0.3934    1,000,000        0
Strong              0.0164    5,000,000        0
EMV                           –1,885,400       0
σ                             2,586,864.287    0
CV                            –137.21%         undefined
Return-to-risk                –0.7288          undefined

(b), (d), (e) Opportunity loss table:

                    Pr        New            Old
Weak                0.5902    4,000,000      0
Moderate            0.3934    0              1,000,000
Strong              0.0164    0              5,000,000
EOL                           2,360,800      475,400

EVPI = $475,400. The product manager should not be willing to pay more than $475,400 for a perfect forecast.
(g) The product manager should continue to use the old packaging to maximize expected monetary value and minimize expected opportunity loss.

20.39

(a) Decision tree payoffs (events 1 through 4 for each action):
A:    –50,000    60,000    130,000    300,000
B:    0          0         0          0

(c), (f) Payoff table:

                                A:                B: No
Event               Pr          Garden Service    Garden Service
1   Very low        0.2         –50,000           0
2   Low             0.5         60,000            0
3   Moderate        0.2         130,000           0
4   High            0.1         300,000           0
EMV                             76,000            0
σ                               94,361.01         0
CV                              124.16%           undefined
Return-to-risk                  0.8054            undefined

(b), (d) Opportunity loss table:

                                A:                B: No
Event               Pr          Garden Service    Garden Service
1   Very low        0.2         50,000            0
2   Low             0.5         0                 60,000
3   Moderate        0.2         0                 130,000
4   High            0.1         0                 300,000
EOL                             10,000            86,000

(e) EVPI = $10,000. The entrepreneur should not be willing to pay more than $10,000 for a perfect forecast.


(g) To maximize the expected monetary value and minimize expected opportunity loss, the entrepreneur should provide gardening service.
(h) Given 3 events of interest out of 20, the binomial probabilities and their related revised conditional probabilities are:

                    Binomial         Revised Conditional
            Pr      Probabilities    Probabilities
Very low    0.2     0.2054           0.2054/0.6019 = 0.3412
Low         0.5     0.0011           0.0011/0.6019 = 0.0018
Moderate    0.2     0.2054           0.2054/0.6019 = 0.3412
High        0.1     0.1901           0.1901/0.6019 = 0.3158
                    0.6019

(i) (c), (f) Payoff table:

                              A:                B: No
                  Pr          Garden Service    Garden Service
Very low          0.3412      –50,000           0
Low               0.0018      60,000            0
Moderate          0.3412      130,000           0
High              0.3158      300,000           0
EMV                           122,159.8         0
σ                             141,884.9         0
CV                            116.15%           undefined
Return-to-risk                0.8610            undefined

(b), (d) Opportunity loss table:

                              A:                B: No
                  Pr          Garden Service    Garden Service
Very low          0.3412      50,000            0
Low               0.0018      0                 60,000
Moderate          0.3412      0                 130,000
High              0.3158      0                 300,000
EOL                           17,062.63         139,222.5

(e) EVPI = $17,062.63. The entrepreneur should not be willing to pay more than $17,062.63 for a perfect forecast.
(g) To maximize the expected monetary value and minimize expected opportunity loss, the entrepreneur should provide gardening service.

20.40

(a) Decision tree payoffs (events 1 through 4 for each action):
A:    20     100    200    400
B:    100    100    100    100

(c), (e), (f) Payoff table:*

                              A: Do Not         B:
Event             Pr          Call Mechanic     Call Mechanic
1   Very low      0.25        20                100
2   Low           0.25        100               100
3   Moderate      0.25        200               100
4   High          0.25        400               100
EMV                           180               100
σ                             142               0
CV                            78.96%            0
Return-to-risk                1.2665            undefined

*Note: The payoff here is cost and not profit. The opportunity cost is therefore calculated as the difference between the payoff and the minimum in the same row.

(b), (d) Opportunity loss table:

                              A: Do Not         B:
                  Pr          Call Mechanic     Call Mechanic
Very low          0.25        0                 80
Low               0.25        0                 0
Moderate          0.25        100               0
High              0.25        300               0
EOL                           100               20

(e) EVPI = $20. The manufacturer should not be willing to pay more than $20 for the information about which event will occur.
(g) We want to minimize the expected monetary value because it is a cost. To minimize the expected monetary value, call the mechanic.
(h) Given 2 defectives out of 15, the binomial probabilities and their related revised conditional probabilities are:

                     Binomial         Revised Conditional
            Pr       Probabilities    Probabilities
Very low    0.01     0.0092           0.0092/0.6418 = 0.0143
Low         0.05     0.1348           0.1348/0.6418 = 0.2100
Moderate    0.10     0.2669           0.2669/0.6418 = 0.4159
High        0.20     0.2309           0.2309/0.6418 = 0.3598
                     0.6418
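The revised probabilities in this table can be reproduced with a binomial likelihood and Bayes' theorem. A minimal sketch, assuming the Pr column holds each event's defect rate and that the four events carry the equal prior of 0.25 used in the payoff table (so the prior cancels in the division); the helper name is hypothetical.

```python
from math import comb

def binom_pmf(x, n, p):
    """Binomial probability of x events of interest in n trials."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

# Assumed reading of the table: Pr column = defect rate for each event.
defect_rates = {"Very low": 0.01, "Low": 0.05, "Moderate": 0.10, "High": 0.20}
prior = 0.25  # equal prior for every event, as in the original payoff table

likelihood = {e: binom_pmf(2, 15, p) for e, p in defect_rates.items()}
joint = {e: prior * lk for e, lk in likelihood.items()}
total = sum(joint.values())
revised = {e: j / total for e, j in joint.items()}
```

Because the script carries full precision, its results can differ from the table in the fourth decimal place, where the table divides already-rounded binomial probabilities.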

(i) (c), (e), (f) Payoff table:

                              A: Do Not         B:
                  Pr          Call Mechanic     Call Mechanic
Very low          0.0143      20                100
Low               0.2100      100               100
Moderate          0.4159      200               100
High              0.3598      400               100
EMV                           248.3860          100
σ                             121               0
CV                            48.68%            0
Return-to-risk                2.0544            undefined

(i) (b), (d) Opportunity loss table:

                              A: Do Not         B:
                  Pr          Call Mechanic     Call Mechanic
Very low          0.0143      0                 80
Low               0.2100      0                 0
Moderate          0.4159      100               0
High              0.3598      300               0
EOL                           149.53            1.1440

(e) EVPI = $1.14. The manufacturer should not be willing to pay more than $1.14 for the information about which event will occur.
(g) We want to minimize the expected monetary value because it is a cost. To minimize the expected monetary value, call the mechanic.

Online Sections

Chapter 5

5.45

PHStat output:

Probabilities & Outcomes:
P      X      Y
0.4    100    200
0.6    200    100

Statistics
E(X)                         160
E(Y)                         140
Variance(X)                  2400
Standard Deviation(X)        48.98979
Variance(Y)                  2400
Standard Deviation(Y)        48.98979
Covariance(XY)               -2400
Variance(X+Y)                0
Standard Deviation(X+Y)      0

Portfolio Management
Weight Assigned to X         0.5
Weight Assigned to Y         0.5
Portfolio Expected Return    150
Portfolio Risk               0

(a) E(X) = (0.4)($100) + (0.6)($200) = $160
    E(Y) = (0.4)($200) + (0.6)($100) = $140
(b) σ_X = √[(0.4)(100 – 160)² + (0.6)(200 – 160)²] = √2400 = $48.99
    σ_Y = √[(0.4)(200 – 140)² + (0.6)(100 – 140)²] = √2400 = $48.99
(c) σ_XY = (0.4)(100 – 160)(200 – 140) + (0.6)(200 – 160)(100 – 140) = –2400
(d) E(X + Y) = E(X) + E(Y) = $160 + $140 = $300
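The 5.45 statistics follow directly from the two-outcome distribution. A minimal sketch of the same calculations (the variable and function names are illustrative and are not PHStat's):

```python
import math

probs = [0.4, 0.6]
x = [100, 200]
y = [200, 100]

def expected(values):
    """Probability-weighted mean of a discrete variable."""
    return sum(p * v for p, v in zip(probs, values))

ex, ey = expected(x), expected(y)
var_x = sum(p * (xi - ex) ** 2 for p, xi in zip(probs, x))
var_y = sum(p * (yi - ey) ** 2 for p, yi in zip(probs, y))
cov_xy = sum(p * (xi - ex) * (yi - ey) for p, xi, yi in zip(probs, x, y))

# Portfolio with weight w on X and (1 - w) on Y.
w = 0.5
portfolio_return = w * ex + (1 - w) * ey
portfolio_var = w**2 * var_x + (1 - w) ** 2 * var_y + 2 * w * (1 - w) * cov_xy
portfolio_risk = math.sqrt(max(portfolio_var, 0.0))  # guard against float round-off
```

With equal weights the negative covariance offsets the two variances exactly, which is why the portfolio risk comes out as zero.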


5.46

PHStat output:

Probabilities & Outcomes:
P      X       Y
0.2    -100    50
0.4    50      30
0.3    200     20
0.1    300     20

Statistics
E(X)                         90
E(Y)                         30
Variance(X)                  15900
Standard Deviation(X)        126.0952
Variance(Y)                  120
Standard Deviation(Y)        10.95445
Covariance(XY)               -1300
Variance(X+Y)                13420
Standard Deviation(X+Y)      115.8447

Portfolio Management
Weight Assigned to X         0.5
Weight Assigned to Y         0.5
Portfolio Expected Return    60
Portfolio Risk               57.92236

(a) E(X) = (0.2)($–100) + (0.4)($50) + (0.3)($200) + (0.1)($300) = $90
    E(Y) = (0.2)($50) + (0.4)($30) + (0.3)($20) + (0.1)($20) = $30
(b) σ_X = √[(0.2)(–100 – 90)² + (0.4)(50 – 90)² + (0.3)(200 – 90)² + (0.1)(300 – 90)²] = √15900 = 126.10
    σ_Y = √[(0.2)(50 – 30)² + (0.4)(30 – 30)² + (0.3)(20 – 30)² + (0.1)(20 – 30)²] = √120 = 10.95
(c) σ_XY = (0.2)(–100 – 90)(50 – 30) + (0.4)(50 – 90)(30 – 30) + (0.3)(200 – 90)(20 – 30) + (0.1)(300 – 90)(20 – 30) = –1300
(d) E(X + Y) = E(X) + E(Y) = $90 + $30 = $120

5.47

(a) E(P) = (0.4)($50) + (0.6)($100) = $80
(b) σ_P = √[(0.4)²(9000) + (0.6)²(15000) + 2(0.4)(0.6)(7500)] = 102.18

5.48

(a) E(total time) = E(time waiting) + E(time served) = 4 + 5.5 = 9.5 minutes
(b) σ(total time) = √(1.2² + 1.5²) = 1.9209 minutes

5.49

(a) E(P) = 0.3(65) + 0.7(35) = $44
    σ_P = √[(0.3)²(37,525) + (0.7)²(11,025) + 2(0.3)(0.7)(–19,275)] = $26.15
    CV = (σ_P / E(P)) × 100% = (26.15 / 44) × 100% = 59.44%
(b) E(P) = 0.7(65) + 0.3(35) = $56
    σ_P = √[(0.7)²(37,525) + (0.3)²(11,025) + 2(0.7)(0.3)(–19,275)] = $106.23
    CV = (σ_P / E(P)) × 100% = (106.23 / 56) × 100% = 189.69%
(c) Investing 30% in the Dow Jones index and 70% in the weak-economy fund will yield the lowest risk per unit average return at 59.44%. This will be the investment recommendation if you are a risk-averse investor.
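The two 5.49 portfolios can be checked with the weighted-variance formula. A minimal sketch with a hypothetical helper; it assumes a covariance of –19,275, the negative sign being the value that reproduces the reported portfolio risks of $26.15 and $106.23.

```python
import math

def portfolio(w, mean_x, mean_y, var_x, var_y, cov_xy):
    """Expected return, risk and CV (%) with weight w on X and 1 - w on Y."""
    expected_return = w * mean_x + (1 - w) * mean_y
    variance = w**2 * var_x + (1 - w) ** 2 * var_y + 2 * w * (1 - w) * cov_xy
    risk = math.sqrt(variance)
    return expected_return, risk, risk / expected_return * 100

# Means 65 and 35, variances 37,525 and 11,025; covariance sign is an assumption.
ret_a, risk_a, cv_a = portfolio(0.3, 65, 35, 37_525, 11_025, -19_275)
ret_b, risk_b, cv_b = portfolio(0.7, 65, 35, 37_525, 11_025, -19_275)
```

The mix with the lower coefficient of variation carries the lower risk per unit of expected return.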

5.50

PHStat output for (a)-(c):

Covariance Analysis
Probabilities & Outcomes:
P      X       Y
0.1    -100    50
0.3    0       150
0.3    80      -20
0.3    150     -100

Statistics
E(X)                         59
E(Y)                         14
Variance(X)                  6189
Standard Deviation(X)        78.6702
Variance(Y)                  9924
Standard Deviation(Y)        99.61928
Covariance(XY)               -6306
Variance(X+Y)                3501
Standard Deviation(X+Y)      59.16925

(a) E(X) = Σ x_i P(x_i) = 59
    E(Y) = Σ y_i P(y_i) = 14
(b) σ_X = √{Σ [x_i – E(X)]² P(x_i)} = 78.6702
    σ_Y = √{Σ [y_i – E(Y)]² P(y_i)} = 99.62
(c) σ_XY = Σ [x_i – E(X)][y_i – E(Y)] P(x_i, y_i) = –6306
    (each sum runs over i = 1, …, N)

(d)

Stock X gives the investor a lower standard deviation while yielding a higher expected return so the investor should select stock X.

5.51

(a) PHStat output:

Probabilities & Outcomes:
P      X       Y
0.1    -100    50
0.3    0       150
0.3    80      -20
0.3    150     -100

Statistics
E(X)                         59
E(Y)                         14
Variance(X)                  6189
Standard Deviation(X)        78.6702
Variance(Y)                  9924
Standard Deviation(Y)        99.61928
Covariance(XY)               -6306
Variance(X+Y)                3501
Standard Deviation(X+Y)      59.16925

Portfolio Management
Weight Assigned to X         0.3
Weight Assigned to Y         0.7
Portfolio Expected Return    27.5
Portfolio Risk               52.64266

E(P) = $27.5   σ_P = 52.64   CV = (σ_P / E(P)) × 100% = (52.64 / 27.5) × 100% = 191.42%

(b) PHStat output:

Probabilities & Outcomes:
P      X       Y
0.1    -100    50
0.3    0       150
0.3    80      -20
0.3    150     -100

Statistics
E(X)                         59
E(Y)                         14
Variance(X)                  6189
Standard Deviation(X)        78.6702
Variance(Y)                  9924
Standard Deviation(Y)        99.61928
Covariance(XY)               -6306
Variance(X+Y)                3501
Standard Deviation(X+Y)      59.16925

Portfolio Management
Weight Assigned to X         0.5
Weight Assigned to Y         0.5
Portfolio Expected Return    36.5
Portfolio Risk               29.58462

E(P) = $36.5   σ_P = 29.59   CV = (σ_P / E(P)) × 100% = (29.59 / 36.5) × 100% = 81.07%

(c) PHStat output:

Probabilities & Outcomes:
P      X       Y
0.1    -100    50
0.3    0       150
0.3    80      -20
0.3    150     -100

Statistics
E(X)                         59
E(Y)                         14
Variance(X)                  6189
Standard Deviation(X)        78.6702
Variance(Y)                  9924
Standard Deviation(Y)        99.61928
Covariance(XY)               -6306
Variance(X+Y)                3501
Standard Deviation(X+Y)      59.16925

Portfolio Management
Weight Assigned to X         0.7
Weight Assigned to Y         0.3
Portfolio Expected Return    45.5
Portfolio Risk               35.73863

E(P) = $45.5   σ_P = 35.74   CV = (σ_P / E(P)) × 100% = (35.74 / 45.5) × 100% = 78.55%



(d)

Based on the results of (a)-(c), you should recommend a portfolio with 70% of stock X and 30% of stock Y because it has the lowest risk per unit average return.

5.52

PHStat output:

Probabilities & Outcomes:
P      X      Y
0.1    -50    -100
0.3    20     50
0.4    100    130
0.2    150    200

Statistics
E(X)                         71
E(Y)                         97
Variance(X)                  3829
Standard Deviation(X)        61.87891
Variance(Y)                  7101
Standard Deviation(Y)        84.26743
Covariance(XY)               5113
Variance(X+Y)                21156
Standard Deviation(X+Y)      145.451

(a) E(X) = $71   E(Y) = $97
(b) σ_X = 61.88   σ_Y = 84.27

5.53

(a) E(P) = $106.84   σ_P = $87.7145   CV = (σ_P / E(P)) × 100% = 82.10%
(b) E(P) = $94.4   σ_P = $73.1575   CV = (σ_P / E(P)) × 100% = 77.50%
(c) E(P) = $81.96   σ_P = $61.1439   CV = (σ_P / E(P)) × 100% = 74.60%

(d) Based on the results of (a)-(c), you should recommend a portfolio with 70% of Black Swan fund and 30% of Good Times fund because it has the lowest risk per unit average return as measured by the coefficient of variation.

5.54

(a) Let X = corporate bond fund, Y = common stock fund. E(X) = $66.2   E(Y) = $63.01
(b) σ_X = $57.2150   σ_Y = $195.2172
(c) According to the probability of 0.01, it is highly unlikely that you will lose $999 of every $1,000 invested.

5.55

(a) E(P) = $63.967   σ_P = $153.2659   CV = (σ_P / E(P)) × 100% = 239.60%
(b) E(P) = $64.61   σ_P = $125.4161   CV = (σ_P / E(P)) × 100% = 194.13%
(c) E(P) = $65.24   σ_P = $97.75455   CV = (σ_P / E(P)) × 100% = 149.83%
(d) Since investing $700 in the corporate bond fund and $300 in the common stock fund has the lowest coefficient of variation at 149.83%, you should recommend this portfolio.

5.56

(a) PHStat output:

Hypergeometric Probabilities
Data
Sample size                       4
No. of successes in population    5
Population size                   10

Hypergeometric Probabilities Table
X    P(X)
3    0.238095

P(X = 3) = [C(5, 3) × C(10 – 5, 4 – 3)] / C(10, 4) = (10 × 5) / 210 = 5/21 = 0.2381
where C(a, b) = a! / [b!(a – b)!]

(b) PHStat output:

Hypergeometric Probabilities
Data
Sample size                       4
No. of successes in population    3
Population size                   6

Hypergeometric Probabilities Table
X    P(X)
1    0.2
2    0.6
3    0.2

P(X = 1) = [C(3, 1) × C(6 – 3, 4 – 1)] / C(6, 4) = (3 × 1) / 15 = 1/5 = 0.2

(c) Partial PHStat output:

Hypergeometric Probabilities
Data
Sample size                       5
No. of successes in population    3
Population size                   12

Hypergeometric Probabilities Table
X    P(X)
0    0.159091

P(X = 0) = [C(3, 0) × C(12 – 3, 5 – 0)] / C(12, 5) = (1 × 126) / 792 = 7/44 = 0.1591

(d) Partial PHStat output:

Hypergeometric Probabilities
Data
Sample size                       3
No. of successes in population    3
Population size                   10

Hypergeometric Probabilities Table
X    P(X)
3    0.008333

P(X = 3) = [C(3, 3) × C(10 – 3, 3 – 3)] / C(10, 3) = (1 × 1) / 120 = 1/120 = 0.0083

5.57

(a) μ = nE/N = (4)(5)/10 = 2        σ = √{[nE(N – E)/N²] × [(N – n)/(N – 1)]} = 0.8165
(b) μ = nE/N = (4)(3)/6 = 2         σ = √{[nE(N – E)/N²] × [(N – n)/(N – 1)]} = 0.6325
(c) μ = nE/N = (5)(3)/12 = 1.25     σ = √{[nE(N – E)/N²] × [(N – n)/(N – 1)]} = 0.7724
(d) μ = nE/N = (3)(3)/10 = 0.9      σ = √{[nE(N – E)/N²] × [(N – n)/(N – 1)]} = 0.7
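The hypergeometric probabilities in 5.56 and the means and standard deviations in 5.57 can be verified with the standard library alone. A minimal sketch; the function names are hypothetical, with n the sample size, E the number of events of interest in the population, and N the population size.

```python
import math
from math import comb

def hypergeom_pmf(x, n, E, N):
    """P(X = x) when sampling n items without replacement from N, E of interest."""
    return comb(E, x) * comb(N - E, n - x) / comb(N, n)

def hypergeom_mean_sd(n, E, N):
    """Mean nE/N and standard deviation with the finite-population correction."""
    mean = n * E / N
    sd = math.sqrt(n * E * (N - E) / N**2 * (N - n) / (N - 1))
    return mean, sd

p_a = hypergeom_pmf(3, n=4, E=5, N=10)        # 5.56 (a)
mean_a, sd_a = hypergeom_mean_sd(n=4, E=5, N=10)  # 5.57 (a)
```

The factor (N – n)/(N – 1) is what separates this standard deviation from the binomial one; it shrinks the spread as the sample becomes a larger share of the population.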

(a)

Partial PHStat outuput: Hypergeometric Probabilities Data Sample size

6

Copyright ©2024 Pearson Education, Inc.


Solutions to End-of-Section and Chapter Review Problems lxxxix No. of successes in population

25

Population size

100

Hypergeometric Probabilities Table X

P(X)

0

0.168918

1

0.361968

2

0.305888

3

0.130286

4

0.029448

5

0.003343

6

0.000149

Copyright ©2024 Pearson Education, Inc.


5.58 cont.

(a)

If n = 6, E = 25, and N = 100,   25  100  25   25  100  25         0 60  +  1  6  1  P(X  2) = 1 – [P(X = 0) + P(X = 1)] = 1 –       100   100        6  6   

(b)

= 1 – [0.1689 + 0.3620] = 0.4691 Partial PHStat output: Hypergeometric Probabilities Data Sample size

6

No. of successes in population

30

Population size

100

Hypergeometric Probabilities Table X

P(X)

0

0.109992

1

0.304593

2

0.33459

3

0.186438

4

0.05552

5

0.008368

6

0.000498

If n = 6, E = 30, and N = 100,

  30  100  30   30  100  30         0 6  0   1  6  1  P(X  2) = 1 – [P(X = 0) + P(X = 1)] = 1 –    +    100   100        6  6    (c)

= 1 – [0.1100 + 0.3046] = 0.5854 Partial PHStat output: Hypergeometric Probabilities

Copyright ©2024 Pearson Education, Inc.


Solutions to End-of-Section and Chapter Review Problems xci Data Sample size

6

No. of successes in population

5

Population size

100

Hypergeometric Probabilities Table X

P(X)

0

0.729085

1

0.243028

2

0.026706

3

0.001161

4

1.87E-05

5

7.97E-08

Copyright ©2024 Pearson Education, Inc.


5.58 cont.

(c)

If n = 6, E = 5, and N = 100,   5  100  5   5  100  5         0 6  0   1  6  1   P(X  2) = 1 – [P(X = 0) + P(X = 1)] = 1 –    +    100   100        6  6   

(d)

= 1 – [0.7291 + 0.2430] = 0.0279 Partial PHStat output: Hypergeometric Probabilities Data Sample size

6

No. of successes in population

10

Population size

100

Hypergeometric Probabilities Table X

P(X)

0

0.522305

1

0.368686

2

0.096458

3

0.011826

4

0.000706

5

1.9E-05

6

1.76E-07

If n = 6, E = 10, and N = 100,

$P(X \ge 2) = 1 - [P(X = 0) + P(X = 1)] = 1 - \left[ \frac{\binom{10}{0}\binom{100-10}{6-0}}{\binom{100}{6}} + \frac{\binom{10}{1}\binom{100-10}{6-1}}{\binom{100}{6}} \right] = 1 - [0.5223 + 0.3687] = 0.1090$

(e)

The probability that the entire group will be audited is very sensitive to the true number of improper returns in the population. If the true number is very low (E = 5), the probability is very low (0.0279). When the true number is increased by a factor of six (E = 30), the probability the group will be audited increases by a factor of almost 21 (0.5854).
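The four probabilities in Problem 5.58 can be cross-checked without PHStat. The sketch below is only an illustration using Python's standard library; the function names `hypergeom_pmf` and `prob_at_least_two` are made up for this example and are not part of PHStat or the textbook.

```python
from math import comb


def hypergeom_pmf(x, n, E, N):
    """P(X = x) when n items are sampled without replacement from a
    population of N items that contains E events of interest."""
    return comb(E, x) * comb(N - E, n - x) / comb(N, n)


def prob_at_least_two(n, E, N):
    """P(X >= 2) = 1 - [P(X = 0) + P(X = 1)], as in parts (a)-(d)."""
    return 1 - hypergeom_pmf(0, n, E, N) - hypergeom_pmf(1, n, E, N)


# n = 6 returns sampled from N = 100, for each assumed number of improper returns E
for E in (25, 30, 5, 10):
    print(f"E = {E:2d}: P(X >= 2) = {prob_at_least_two(6, E, 100):.4f}")
```

Running the loop for E = 5 and E = 30 makes the sensitivity described in part (e) easy to see side by side.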

5.59

Partial PHStat output:

(a) P(X = 0) = 0.4015
(b) P(X ≥ 1) = 1 – 0.4015 = 0.5985
(c) P(X ≤ 2) = P(X = 0) + P(X = 1) + P(X = 2) = 0.9758
(d) Partial PHStat output:

P(X = 0) = 0.2701

5.60

PHStat output:

Data
Sample size                       4
No. of successes in population    4
Population size                   30

Hypergeometric Probabilities Table
X    P(X)
0    0.545521
1    0.379493
2    0.071155
3    0.003795
4    3.65E-05

(a) $P(X = 4) = 3.6490 \times 10^{-5}$
(b) P(X = 0) = 0.5455
(c) P(X ≥ 1) = 0.4545
(d) E = 6: (a) P(X = 4) = 0.0005 (b) P(X = 0) = 0.3877 (c) P(X ≥ 1) = 0.6123

5.61

Partial PHStat output:

(a) P(X = 0) = 0.0404
(b) P(X ≥ 1) = 1 – P(X = 0) = 0.9596
(c) P(X = 4) = 0.1318
(d) P(X < 4) = 0.8296

5.62

Partial PHStat output:

(a) P(X = 1) = 0.2424
(b) P(X ≥ 1) = 1 – P(X = 0) = 0.9697
(c) P(X = 3) = 0.2424
(d) Because the number of events of interest in the population is a smaller fraction of the population size in (c), the probability in (c) is smaller than that in Example 5.7.


Chapter 6

6.43

(a)

PHStat output: Exponential Probabilities

Data
Mean       10
X Value    0.1

Results
P(<=X)     0.6321

$P(\text{arrival time} < 0.1) = 1 - e^{-\lambda x} = 1 - e^{-(10)(0.1)} = 0.6321$

(b)

P(arrival time > 0.1) = 1 – P(arrival time ≤ 0.1) = 1 – 0.6321 = 0.3679

(c)

PHStat output: Exponential Probabilities

Data
Mean       10
X Value    0.2

Results
P(<=X)     0.8647

P(0.1 < arrival time < 0.2) = P(arrival time < 0.2) – P(arrival time < 0.1) = 0.8647 – 0.6321 = 0.2326

(d)

P(arrival time < 0.1) + P(arrival time > 0.2) = 0.6321 + 0.1353 = 0.7674

6.44

(a)

PHStat output: Exponential Probabilities

Data
Mean       30
X Value    0.1

Results
P(<=X)     0.9502

$P(\text{arrival time} < 0.1) = 1 - e^{-\lambda x} = 1 - e^{-(30)(0.1)} = 0.9502$

(b)

P(arrival time > 0.1) = 1 – P(arrival time ≤ 0.1) = 1 – 0.9502 = 0.0498

(c)

PHStat output: Exponential Probabilities

Data
Mean       30
X Value    0.2

Results
P(<=X)     0.9975

P(0.1 < arrival time < 0.2) = P(arrival time < 0.2) – P(arrival time < 0.1) = 0.9975 – 0.9502 = 0.0473

(d)

P(arrival time < 0.1) + P(arrival time > 0.2) = 0.9502 + 0.0025 = 0.9527

6.45

(a)

PHStat output:

Data
Mean       5
X Value    0.3

Results
P(<=X)     0.7769

$P(\text{arrival time} \le 0.3) = 1 - e^{-(5)(0.3)} = 0.7769$

(b)

P(arrival time > 0.3) = 1 – P(arrival time < 0.3) = 0.2231

(c)

PHStat output:

Data
Mean       5
X Value    0.5

Results
P(<=X)     0.9179

P(0.3 < arrival time < 0.5) = P(arrival time < 0.5) – P(arrival time < 0.3) = 0.9179 – 0.7769 = 0.1410

(d)

P(arrival time < 0.3 or > 0.5) = 1 – P(0.3 < arrival time < 0.5) = 0.8590

6.46

(a)

PHStat output: Exponential Probabilities

Data
Mean       50
X Value    0.05

Results
P(<=X)     0.9179

$P(\text{arrival time} \le 0.05) = 1 - e^{-(50)(0.05)} = 0.9179$

(b)

PHStat output: Exponential Probabilities

Data
Mean       50
X Value    0.0167

Results
P(<=X)     0.5661

P(arrival time ≤ 0.0167) = 1 – 0.4339 = 0.5661

(c)

PHStat output: Exponential Probabilities

Data
Mean       60
X Value    0.05

Results
P(<=X)     0.9502

Exponential Probabilities

Data
Mean       60
X Value    0.0167

Results
P(<=X)     0.6329

If λ = 60, P(arrival time ≤ 0.05) = 0.9502, P(arrival time ≤ 0.0167) = 0.6329

(d)

PHStat output: Exponential Probabilities

Data
Mean       30
X Value    0.05

Results
P(<=X)     0.7769

Exponential Probabilities

Data
Mean       30
X Value    0.0167

Results
P(<=X)     0.3941

If λ = 30, P(arrival time ≤ 0.05) = 0.7769, P(arrival time ≤ 0.0167) = 0.3941

6.47

(a)

PHStat output: Exponential Probabilities

Data
Mean       2
X Value    1

Results
P(<=X)     0.8647

P(arrival time ≤ 1) = 0.8647

(b)

PHStat output: Exponential Probabilities

Data
Mean       2
X Value    5

Results
P(<=X)     0.999955

P(arrival time ≤ 5) = 0.99996

(c)

PHStat output: Exponential Probabilities

Data
Mean       1
X Value    1

Results
P(<=X)     0.6321

Exponential Probabilities

Data
Mean       1
X Value    5

Results
P(<=X)     0.993262

If λ = 1, P(arrival time ≤ 1) = 0.6321, P(arrival time ≤ 5) = 0.9933

6.48

(a)

PHStat output: Exponential Probabilities

Data
Mean       15
X Value    0.05

Results
P(<=X)     0.5276

$P(\text{arrival time} \le 0.05) = 1 - e^{-(15)(0.05)} = 0.5276$

(b)

PHStat output: Exponential Probabilities

Data
Mean       15
X Value    0.25

Results
P(<=X)     0.9765

P(arrival time ≤ 0.25) = 0.9765

(c)

PHStat output: Exponential Probabilities

Data
Mean       25
X Value    0.05

Results
P(<=X)     0.7135

Exponential Probabilities

Data
Mean       25
X Value    0.25

Results
P(<=X)     0.9981

If λ = 25, P(arrival time ≤ 0.05) = 0.7135, P(arrival time ≤ 0.25) = 0.9981

6.49

(a)

PHStat output:

P(next call arrives in ≤ 3) = 0.4512

(b)

PHStat output:

P(next call arrives in ≥ 6) = 1 − 0.6988 = 0.3012

(c)

PHStat output:

P(next call arrives in ≤ 1) = 0.1813

6.50

(a)

PHStat output: Exponential Probabilities

Data
Mean       0.05
X Value    14

Results
P(<=X)     0.5034

$P(X \le 14) = 1 - e^{-(1/20)(14)} = 0.5034$

(b)

PHStat output: Exponential Probabilities

Data
Mean       0.05
X Value    21

Results
P(<=X)     0.6501

$P(X > 21) = 1 - \left[1 - e^{-(1/20)(21)}\right] = 0.3499$

(c)

PHStat output: Exponential Probabilities

Data
Mean       0.05
X Value    7

Results
P(<=X)     0.2953

$P(X \le 7) = 1 - e^{-(1/20)(7)} = 0.2953$
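The exponential probabilities in Problems 6.43 through 6.50 all come from the same cumulative formula, $P(X \le x) = 1 - e^{-\lambda x}$. As a minimal sketch (the helper name `exponential_cdf` is an assumption for illustration, and the value PHStat labels "Mean" is treated here as the rate λ, which is how the worked formulas in these solutions use it), the three answers to Problem 6.50 can be reproduced as follows:

```python
from math import exp


def exponential_cdf(lam, x):
    """P(X <= x) for an exponential distribution with rate lam."""
    return 1 - exp(-lam * x)


lam = 1 / 20  # rate used in Problem 6.50 (shown as 0.05 in the PHStat output)

p_le_14 = exponential_cdf(lam, 14)       # part (a)
p_gt_21 = 1 - exponential_cdf(lam, 21)   # part (b): complement of the CDF
p_le_7 = exponential_cdf(lam, 7)         # part (c)

print(f"{p_le_14:.4f} {p_gt_21:.4f} {p_le_7:.4f}")
```

Because the distribution is continuous, P(X < x) and P(X ≤ x) are the same value, so the one function covers both forms of the question.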

6.51

(a)

PHStat output: Exponential Probabilities

Data
Mean       8
X Value    0.25

Results
P(<=X)     0.8647

P(arrival time ≤ 0.25) = 0.8647

(b)

PHStat output: Exponential Probabilities

Data
Mean       8
X Value    0.05

Results
P(<=X)     0.3297

P(arrival time ≤ 0.05) = 0.3297

(c)

PHStat output: Exponential Probabilities

Data
Mean       15
X Value    0.25

Results
P(<=X)     0.9765

Exponential Probabilities

Data
Mean       15
X Value    0.05

Results
P(<=X)     0.5276

If λ = 15, P(arrival time ≤ 0.25) = 0.9765, P(arrival time ≤ 0.05) = 0.5276

6.52

(a)

PHStat output: Exponential Probabilities

Data
Mean       0.6944
X Value    1

Results
P(<=X)     0.5006

$P(X < 1) = 1 - e^{-(0.6944)(1)} = 0.5006$

(b)

PHStat output: Exponential Probabilities

Data
Mean       0.6944
X Value    2

Results
P(<=X)     0.7506

$P(X < 2) = 1 - e^{-(0.6944)(2)} = 0.7506$

(c)

PHStat output: Exponential Probabilities

Data
Mean       0.6944
X Value    3

Results
P(<=X)     0.8755

$P(X > 3) = 1 - \left[1 - e^{-(0.6944)(3)}\right] = 0.1245$

(d)

The time between visitors is similar to a waiting line (queuing) situation, for which the exponential distribution is most appropriate.

6.53

n = 100, π = 0.20

nπ = 20 ≥ 5 and n(1 – π) = 80 ≥ 5

$\mu = n\pi = 20, \quad \sigma = \sqrt{n\pi(1-\pi)} = 4$

(a)

Partial PHStat output:

Probability for a Range
From X Value        24.5
To X Value          25.5
Z Value for 24.5    1.125
Z Value for 25.5    1.375
P(X<=24.5)          0.8697
P(X<=25.5)          0.9154
P(24.5<=X<=25.5)    0.0457

P(X = 25) ≈ P(24.5 ≤ X ≤ 25.5) = P(1.125 ≤ Z ≤ 1.375) = 0.0457

(b)

Partial PHStat output:

Probability for X >
X Value       25.5
Z Value       1.375
P(X>25.5)     0.0846

P(X > 25) = P(X ≥ 26) ≈ P(X ≥ 25.5) = P(Z ≥ 1.375) = 0.0846

(c)

Partial PHStat output:

Probability for X <=
X Value       25.5
Z Value       1.375
P(X<=25.5)    0.9154343

P(X ≤ 25) ≈ P(X ≤ 25.5) = P(Z ≤ 1.375) = 0.9154

(d)

Common Data
Mean                  20
Standard Deviation    4

Probability for X <=
X Value       24.5
Z Value       1.125
P(X<=24.5)    0.8697055

P(X < 25) = P(X ≤ 24) ≈ P(X ≤ 24.5) = P(Z ≤ 1.125) = 0.8697

6.54

n = 100, π = 0.40

nπ = 40 ≥ 5 and n(1 – π) = 60 ≥ 5

$\mu = n\pi = 40, \quad \sigma = \sqrt{n\pi(1-\pi)} = 4.8990$

(a)

Probability for a Range
From X Value        39.5
To X Value          40.5
Z Value for 39.5    -0.102062
Z Value for 40.5    0.102062
P(X<=39.5)          0.4594
P(X<=40.5)          0.5406
P(39.5<=X<=40.5)    0.0813

P(X = 40) ≈ P(39.5 ≤ X ≤ 40.5) = P(−0.1021 ≤ Z ≤ 0.1021) = 0.0813

(b)

Probability for X >
X Value      40.5
Z Value      0.1020616
P(X>40.5)    0.4594

P(X > 40) = P(X ≥ 41) ≈ P(X ≥ 40.5) = P(Z ≥ 0.1021) = 0.4594

(c)

Probability for X <=
X Value       40.5
Z Value       0.1020616
P(X<=40.5)    0.5406461

P(X ≤ 40) ≈ P(X ≤ 40.5) = P(Z ≤ 0.1021) = 0.5406

(d)

Probability for X <=
X Value       39.5
Z Value       -0.102062
P(X<=39.5)    0.4593539

P(X < 40) = P(X ≤ 39) ≈ P(X ≤ 39.5) = P(Z ≤ –0.1021) = 0.4594

6.55

n = 10, π = 0.50

nπ = 5 ≥ 5 and n(1 – π) = 5 ≥ 5

$\mu = n\pi = 5, \quad \sigma = \sqrt{n\pi(1-\pi)} = 1.5811$

PHStat output:

X     P(X)        P(<=X)      P(<X)       P(>X)       P(>=X)
0     0.000977    0.000977    0           0.999023    1
1     0.009766    0.010742    0.000977    0.989258    0.999023
2     0.043945    0.054688    0.010742    0.945313    0.989258
3     0.117188    0.171875    0.054688    0.828125    0.945313
4     0.205078    0.376953    0.171875    0.623047    0.828125
5     0.246094    0.623047    0.376953    0.376953    0.623047
6     0.205078    0.828125    0.623047    0.171875    0.376953
7     0.117188    0.945313    0.828125    0.054687    0.171875
8     0.043945    0.989258    0.945313    0.010742    0.054687
9     0.009766    0.999023    0.989258    0.000977    0.010742
10    0.000977    1           0.999023    0           0.000977

(a) P(X = 4) = 0.2051
(b) P(X ≥ 4) = 0.8281
(c) P(4 ≤ X ≤ 7) = 0.9453 – 0.1719 = 0.7734
(d)

(a)

Probability for a Range
From X Value       3.5
To X Value         4.5
Z Value for 3.5    -0.948707
Z Value for 4.5    -0.316236
P(X<=3.5)          0.1714
P(X<=4.5)          0.3759
P(3.5<=X<=4.5)     0.2045

P(X = 4) ≈ P(3.5 ≤ X ≤ 4.5) = P(–0.9487 ≤ Z ≤ –0.3162) = 0.2045

(b)

Probability for X >
X Value     3.5
Z Value     -0.948707
P(X>3.5)    0.8286

P(X ≥ 4) ≈ P(X ≥ 3.5) = P(Z ≥ –0.9487) = 0.8286

(c)

Probability for a Range
From X Value       3.5
To X Value         7.5
Z Value for 3.5    -0.948707
Z Value for 7.5    1.581178
P(X<=3.5)          0.1714
P(X<=7.5)          0.9431
P(3.5<=X<=7.5)     0.7717

P(4 ≤ X ≤ 7) ≈ P(3.5 ≤ X ≤ 7.5) = P(–0.9487 ≤ Z ≤ 1.5812) = 0.7717

6.56

$\pi = \frac{1}{3} = 0.3333$, n = 150, nπ = 50 > 5, n(1 – π) = 100 > 5

(a) $P(X \ge 60) = P\left( \frac{X - n\pi}{\sqrt{n\pi(1-\pi)}} \ge \frac{59.5 - 150(0.3333)}{\sqrt{150(0.3333)(1 - 0.3333)}} \right) = P(Z > 1.6454) = 0.0499$
(b) $P(X = 60) = P(59.5 \le X_a \le 60.5) = P(1.6454 \le Z \le 1.8187) = 0.0155$
(c) $P(X < 60) = P(X_a \le 59.5) = P(Z \le 1.6454) = 0.9501$
(d) $P(X = 71) = P(70.5 \le X_a \le 71.5) = P(3.5507 \le Z \le 3.7239) = 0.0001$

6.57

π = 0.5, n = 55, nπ = 27.5 > 5, n(1 – π) = 27.5 > 5

(a) $P(X \ge 38) = P\left( \frac{X - n\pi}{\sqrt{n\pi(1-\pi)}} \ge \frac{37.5 - 27.5}{3.7081} \right) = P(Z > 2.6968) = 0.0035$
(b) The results are virtually the same as those in Problem 5.43 (a).
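The continuity-corrected normal approximation used in Problems 6.53 through 6.57 can be compared with exact binomial values in a few lines. The sketch below uses Python's standard library; the function names are assumptions made for this illustration, and the numbers are those of Problem 6.55 (n = 10, π = 0.50).

```python
from math import comb, sqrt
from statistics import NormalDist


def binomial_pmf(x, n, p):
    """Exact binomial probability P(X = x)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


def normal_approx_range(lo, hi, n, p):
    """Approximate P(lo <= X <= hi) for a binomial X with a normal curve,
    widening the range by 0.5 on each side (continuity correction)."""
    dist = NormalDist(mu=n * p, sigma=sqrt(n * p * (1 - p)))
    return dist.cdf(hi + 0.5) - dist.cdf(lo - 0.5)


exact_p4 = binomial_pmf(4, 10, 0.5)                 # Problem 6.55 (a)
approx_p4 = normal_approx_range(4, 4, 10, 0.5)      # Problem 6.55 (d)(a)
approx_4_to_7 = normal_approx_range(4, 7, 10, 0.5)  # Problem 6.55 (d)(c)

print(f"exact {exact_p4:.4f}, approx {approx_p4:.4f}, range {approx_4_to_7:.4f}")
```

Even with n as small as 10, the approximation for P(X = 4) differs from the exact value only in the third decimal place, because π = 0.50 makes the binomial distribution symmetric.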

Chapter 7

7.36

$\sqrt{\frac{N-n}{N-1}} = \sqrt{\frac{80-10}{80-1}} = 0.9413$

7.37

$\sqrt{\frac{N-n}{N-1}} = \sqrt{\frac{400-100}{400-1}} = 0.8671 \qquad \sqrt{\frac{N-n}{N-1}} = \sqrt{\frac{900-200}{900-1}} = 0.8824$

A sample of size 100 selected without replacement from a population of size 400 has a greater effect in reducing the standard error.

7.38

Whenever a sample is obtained with replacement, no finite population correction factor is needed.
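The finite population correction factors in Problems 7.36 and 7.37 are quick to verify. This is a small sketch in standard-library Python; the function name `fpc_factor` is chosen for this illustration only.

```python
from math import sqrt


def fpc_factor(N, n):
    """Finite population correction factor sqrt((N - n) / (N - 1)) for a
    sample of size n drawn without replacement from a population of size N."""
    return sqrt((N - n) / (N - 1))


for N, n in ((80, 10), (400, 100), (900, 200)):
    print(f"N = {N}, n = {n}: {fpc_factor(N, n):.4f}")
```

The smaller the factor, the more the standard error shrinks, which is why the N = 400, n = 100 case has the greater effect in Problem 7.37.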

7.39

μ = 1.30, σ = 0.04

Since $\frac{n}{N} = \frac{16}{200} > 0.05$ and the sample is selected without replacement, we need to perform the finite population correction.

$\mu_{\bar{X}} = \mu = 1.3, \quad \sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}\sqrt{\frac{N-n}{N-1}} = 0.0096$

PHStat output:

Common Data
Mean                  1.3
Standard Deviation    0.0096

Probability for a Range
From X Value          1.31
To X Value            1.33
Z Value for 1.31      1.041667
Z Value for 1.33      3.125
P(X<=1.31)            0.8512
P(X<=1.33)            0.9991
P(1.31<=X<=1.33)      0.1479

$P(1.31 < \bar{X} < 1.33) = P(1.0417 < Z < 3.125) = 0.1479$

7.40

μ = 3.1, σ = 0.40

Even though the sample is selected without replacement, we do not need to perform the finite population correction since $\frac{n}{N} = \frac{16}{500} < 0.05$. However, if the finite population correction is performed, the answer will be:

$\mu_{\bar{X}} = \mu = 3.1, \quad \sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}\sqrt{\frac{N-n}{N-1}} = 0.0985$

PHStat output:

Common Data
Mean                  3.1
Standard Deviation    0.0985

Find X and Z Given Cum. Pctage.
Cumulative Percentage    85.00%
Z Value                  1.036433
X Value                  3.202089

Probability for X >
X Value    3
Z Value    -1.015228
P(X>3)     0.8450

(a) $P(\bar{X} > 3) = P(Z > -1.0152) = 0.8450$
(b) $P(\bar{X} < A) = P(Z < 1.0364) = 0.85$, so A = 1.0364(0.0985) + 3.1 = 3.20 minutes

7.41

π = 0.10

Since $\frac{n}{N} = \frac{400}{5000} > 0.05$ and the sample is selected without replacement, we need to perform the finite population correction.

$\mu_p = \pi = 0.1, \quad \sigma_p = \sqrt{\frac{\pi(1-\pi)}{n}}\sqrt{\frac{N-n}{N-1}} = 0.0144$

PHStat output:

Common Data
Mean                  0.1
Standard Deviation    0.0144

Probability for a Range
From X Value        0.09
To X Value          0.1
Z Value for 0.09    -0.694444
Z Value for 0.1     0
P(X<=0.09)          0.2437
P(X<=0.1)           0.5000
P(0.09<=X<=0.1)     0.2563

Probability for X <=
X Value       0.08
Z Value       -1.388889
P(X<=0.08)    0.0824333

(a) P(0.09 < p < 0.10) = P(–0.6944 < Z < 0) = 0.2563
(b) P(p < 0.08) = P(Z < –1.3889) = 0.0824

7.42

π = 0.93

Since $\frac{n}{N} = \frac{500}{10000} \ge 0.05$ and the sample is selected without replacement, we will perform the finite population correction.

$\mu_p = \pi = 0.93, \quad \sigma_p = \sqrt{\frac{\pi(1-\pi)}{n}}\sqrt{\frac{N-n}{N-1}} = 0.0111$

PHStat output:

Common Data
Mean                  0.93
Standard Deviation    0.0111

Probability for a Range
From X Value         0.93
To X Value           0.95
Z Value for 0.93     0
Z Value for 0.95     1.801802
P(X<=0.93)           0.5000
P(X<=0.95)           0.9642
P(0.93<=X<=0.95)     0.4642

Probability for X >
X Value      0.95
Z Value      1.8018018
P(X>0.95)    0.0358

(a) P(0.93 < p < 0.95) = P(0 < Z < 1.8018) = 0.4642
(b) P(p > 0.95) = P(Z > 1.8018) = 0.0358
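The sampling-distribution calculation in Problem 7.42 can be sketched in standard-library Python as shown below. The function name `se_proportion_fpc` is an assumption for this illustration. The code keeps the standard error unrounded (about 0.01112), whereas the PHStat output above was given the rounded value 0.0111, so the probabilities agree with the manual's to about three decimal places rather than four.

```python
from math import sqrt
from statistics import NormalDist


def se_proportion_fpc(pi, n, N):
    """Standard error of the sample proportion with the finite
    population correction applied."""
    return sqrt(pi * (1 - pi) / n) * sqrt((N - n) / (N - 1))


pi, n, N = 0.93, 500, 10_000
se = se_proportion_fpc(pi, n, N)
dist = NormalDist(mu=pi, sigma=se)

p_between = dist.cdf(0.95) - dist.cdf(0.93)  # part (a)
p_above = 1 - dist.cdf(0.95)                 # part (b)

print(f"se = {se:.4f}, (a) {p_between:.4f}, (b) {p_above:.4f}")
```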

Chapter 8

8.70

$N\bar{X} \pm N t \frac{S}{\sqrt{n}}\sqrt{\frac{N-n}{N-1}} = 500(25.7) \pm 500(2.7969)\frac{7.8}{\sqrt{25}}\sqrt{\frac{500-25}{500-1}}$

$10,721.53 ≤ Population Total ≤ $14,978.47

8.71

Using PHStat

Confidence Interval Estimate for the Total Difference

Data
Population Size     10000
Sample Size         200
Confidence Level    95%

Intermediate Calculations
Sum of Differences                   200.63
Average Difference in Sample         1.00315
Total Difference                     10031.5
Standard Deviation of Differences    4.998502
FPC Factor                           0.989999
Standard Error of the Total Diff.    3499.126
Degrees of Freedom                   199
t Value                              1.971957
Interval Half Width                  6900.128

Confidence Interval
Interval Lower Limit    3131.37
Interval Upper Limit    16931.63

$N\bar{D} \pm N t \frac{S_D}{\sqrt{n}}\sqrt{\frac{N-n}{N-1}} = 10{,}000(1.00315) \pm 10{,}000(1.972)\frac{4.9985}{\sqrt{200}}\sqrt{\frac{10{,}000-200}{10{,}000-1}}$

3131.37 ≤ Total Difference in the Population ≤ 16931.63

8.72

(a) $p + Z\sqrt{\frac{p(1-p)}{n}}\sqrt{\frac{N-n}{N-1}} = 0.04 + 1.2816\sqrt{\frac{0.04(1-0.04)}{300}}\sqrt{\frac{5000-300}{5000-1}}$

π < 0.05406

(b) $p + Z\sqrt{\frac{p(1-p)}{n}}\sqrt{\frac{N-n}{N-1}} = 0.04 + 1.645\sqrt{\frac{0.04(1-0.04)}{300}}\sqrt{\frac{5000-300}{5000-1}}$

π < 0.05804

(c) $p + Z\sqrt{\frac{p(1-p)}{n}}\sqrt{\frac{N-n}{N-1}} = 0.04 + 2.3263\sqrt{\frac{0.04(1-0.04)}{300}}\sqrt{\frac{5000-300}{5000-1}}$

π < 0.06552

8.73

$N\bar{X} \pm N t \frac{S}{\sqrt{n}}\sqrt{\frac{N-n}{N-1}} = 1000(\$2.55) \pm 1000(1.9842)\frac{\$0.44}{\sqrt{100}}\sqrt{\frac{1000-100}{1000-1}}$

$2,467.13 ≤ Population Total ≤ $2,632.87

8.74

$N\bar{X} \pm N t \frac{S}{\sqrt{n}}\sqrt{\frac{N-n}{N-1}} = 3000(\$261.40) \pm 3000(1.8331)\frac{\$138.8046}{\sqrt{10}}\sqrt{\frac{3000-10}{3000-1}}$

$543,176.96 ≤ Population Total ≤ $1,025,223.04

8.75

$N\bar{X} \pm N t \frac{S}{\sqrt{n}}\sqrt{\frac{N-n}{N-1}} = 1546(\$252.28) \pm 1546(2.0096)\frac{\$93.67}{\sqrt{50}}\sqrt{\frac{1546-50}{1546-1}}$

$349,526.64 ≤ Population Total ≤ $430,523.12

8.76

$N\bar{D} \pm N t \frac{S_D}{\sqrt{n}}\sqrt{\frac{N-n}{N-1}} = 4000(\$7.45907) \pm 4000(2.6092)\frac{\$29.5523}{\sqrt{150}}\sqrt{\frac{4000-150}{4000-1}}$

$5,126.26 ≤ Total Difference in the Population ≤ $54,546.28

Note: The t-value of 2.6092 for 99% confidence and d.f. = 149 was derived on Excel.

8.77

$N\bar{D} \pm N t \frac{S_D}{\sqrt{n}}\sqrt{\frac{N-n}{N-1}} = 1200(\$0.9583) \pm 1200(1.9801)\frac{\$25.2448}{\sqrt{120}}\sqrt{\frac{1200-120}{1200-1}}$

–$4,046.99 ≤ Total Difference in the Population ≤ $6,346.99

8.78

(a) $p + Z\sqrt{\frac{p(1-p)}{n}}\sqrt{\frac{N-n}{N-1}} = 0.0367 + 1.645\sqrt{\frac{0.0367(1-0.0367)}{300}}\sqrt{\frac{10000-300}{10000-1}}$

π < 0.0542

(b) Since the upper bound is higher than the tolerable exception rate of 0.04, the auditor should request a larger sample.
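The one-sided upper bounds for a proportion in Problems 8.72 and 8.78 all use the same expression, $p + Z\sqrt{p(1-p)/n}\sqrt{(N-n)/(N-1)}$. The sketch below evaluates it for the three confidence levels of Problem 8.72; the function name `upper_bound_proportion` is an assumption for this illustration, and the Z values are the ones quoted in the solutions.

```python
from math import sqrt


def upper_bound_proportion(p, n, N, z):
    """One-sided upper confidence bound for a population proportion,
    with the finite population correction applied."""
    return p + z * sqrt(p * (1 - p) / n) * sqrt((N - n) / (N - 1))


# Problem 8.72: p = 0.04, n = 300, N = 5000
bound_90 = upper_bound_proportion(0.04, 300, 5000, 1.2816)  # part (a)
bound_95 = upper_bound_proportion(0.04, 300, 5000, 1.645)   # part (b)
bound_99 = upper_bound_proportion(0.04, 300, 5000, 2.3263)  # part (c)

print(f"{bound_90:.5f} {bound_95:.5f} {bound_99:.5f}")
```

An auditor would then compare the bound with the tolerable exception rate, as in Problem 8.78 (b): a bound above that rate suggests that a larger sample is needed.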

8.79

(a) $p + Z\sqrt{\frac{p(1-p)}{n}}\sqrt{\frac{N-n}{N-1}} = 0.024 + 1.645\sqrt{\frac{0.024(1-0.024)}{500}}\sqrt{\frac{5000-500}{5000-1}}$

π < 0.0347

(b) With 95% level of confidence, the auditor can conclude that the rate of noncompliance is less than 0.0347, which is less than the 0.05 tolerable exception rate for internal control, and, hence, the internal control compliance is adequate.

8.80

$\bar{X} \pm t_{n-1}\frac{S}{\sqrt{n}}\sqrt{\frac{N-n}{N-1}} = 75 \pm 2.0301\frac{24}{\sqrt{36}}\sqrt{\frac{200-36}{200-1}}$

67.63 ≤ μ ≤ 82.37

8.81

$n_0 = \frac{Z^2\sigma^2}{e^2} = \frac{1.96^2(20^2)}{5^2} = 61.4633$

$n = \frac{n_0 N}{n_0 + (N-1)} = \frac{61.4633(1000)}{61.4633 + (1000-1)} = 57.9589$

Use n = 58
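The two-step sample size calculation in Problem 8.81 (and in Problems 8.82, 8.83, and 8.86) can be sketched as follows. The function name `sample_size_mean` is an assumption for this illustration. The Z value is taken from the standard normal distribution at full precision rather than rounded to 1.96, which appears to be how the manual arrives at $n_0 = 61.4633$.

```python
from math import ceil
from statistics import NormalDist


def sample_size_mean(confidence, sigma, e, N):
    """Sample size for estimating a mean within +/- e.

    Returns (n0, n, n_rounded): the size ignoring the finite population,
    the size after the finite population adjustment, and that size rounded up.
    """
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    n0 = z**2 * sigma**2 / e**2
    n = n0 * N / (n0 + (N - 1))
    return n0, n, ceil(n)


n0, n, n_rounded = sample_size_mean(0.95, sigma=20, e=5, N=1000)  # Problem 8.81
print(f"n0 = {n0:.4f}, n = {n:.4f}, use n = {n_rounded}")
```

Rounding up rather than to the nearest integer keeps the sampling error at or below the target e.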


8.82

(a) $\bar{X} \pm Z\frac{\sigma}{\sqrt{n}}\sqrt{\frac{N-n}{N-1}} = 350 \pm 1.96\frac{100}{\sqrt{50}}\sqrt{\frac{2000-50}{2000-1}}$

322.6238 ≤ μ ≤ 377.3762

(b) $n_0 = \frac{Z^2\sigma^2}{e^2} = \frac{1.96^2(100^2)}{20^2} = 96.0364$

$n = \frac{n_0 N}{n_0 + (N-1)} = \frac{96.0364(2000)}{96.0364 + (2000-1)} = 91.6799$. Use n = 92

(c) (a) $\bar{X} \pm Z\frac{\sigma}{\sqrt{n}}\sqrt{\frac{N-n}{N-1}} = 350 \pm 1.96\frac{100}{\sqrt{50}}\sqrt{\frac{1000-50}{1000-1}}$

322.9703 ≤ μ ≤ 377.0297

(b) $n_0 = \frac{Z^2\sigma^2}{e^2} = \frac{1.96^2(100^2)}{20^2} = 96.0364$

$n = \frac{n_0 N}{n_0 + (N-1)} = \frac{96.0364(1000)}{96.0364 + (1000-1)} = 87.7015$. Use n = 88

8.83

$n_0 = \frac{Z^2\sigma^2}{e^2} = \frac{1.96^2(400^2)}{50^2} = 245.8531$

$n = \frac{n_0 N}{n_0 + (N-1)} = \frac{245.8531(3000)}{245.8531 + (3000-1)} = 227.3013$. Use n = 228

8.84

(a) $p \pm Z\sqrt{\frac{p(1-p)}{n}}\sqrt{\frac{N-n}{N-1}} = 0.3 \pm 1.6449\sqrt{\frac{0.3(1-0.3)}{100}}\sqrt{\frac{1000-100}{1000-1}}$

0.2285 ≤ π ≤ 0.3715

(b) $n_0 = \frac{Z^2 p(1-p)}{e^2} = \frac{1.6449^2(0.3)(1-0.3)}{0.05^2} = 227.2656$

$n = \frac{n_0 N}{n_0 + (N-1)} = \frac{227.2656(1000)}{227.2656 + (1000-1)} = 185.3315$. Use n = 186

(c) (a) $p \pm Z\sqrt{\frac{p(1-p)}{n}}\sqrt{\frac{N-n}{N-1}} = 0.3 \pm 1.6449\sqrt{\frac{0.3(1-0.3)}{100}}\sqrt{\frac{2000-100}{2000-1}}$

0.2265 ≤ π ≤ 0.3735

(b) $n_0 = \frac{Z^2 p(1-p)}{e^2} = \frac{1.6449^2(0.3)(1-0.3)}{0.05^2} = 227.2656$

$n = \frac{n_0 N}{n_0 + (N-1)} = \frac{227.2656(2000)}{227.2656 + (2000-1)} = 204.1676$. Use n = 205

8.85

(a) $p \pm Z\sqrt{\frac{p(1-p)}{n}}\sqrt{\frac{N-n}{N-1}} = 0.41 \pm 1.96\sqrt{\frac{0.41(1-0.41)}{200}}\sqrt{\frac{4000-200}{4000-1}}$

0.3436 ≤ π ≤ 0.4764

(b) $n_0 = \frac{Z^2 p(1-p)}{e^2} = \frac{1.96^2(0.41)(1-0.41)}{0.025^2} = 1486.7964$

$n = \frac{n_0 N}{n_0 + (N-1)} = \frac{1486.7964(4000)}{1486.7964 + (4000-1)} = 1084.1062$. Use n = 1085

(c) (a) $p \pm Z\sqrt{\frac{p(1-p)}{n}}\sqrt{\frac{N-n}{N-1}} = 0.41 \pm 1.96\sqrt{\frac{0.41(1-0.41)}{200}}\sqrt{\frac{6000-200}{6000-1}}$

0.3430 ≤ π ≤ 0.4770

(b) $n_0 = \frac{Z^2 p(1-p)}{e^2} = \frac{1.96^2(0.41)(1-0.41)}{0.025^2} = 1486.7964$

$n = \frac{n_0 N}{n_0 + (N-1)} = \frac{1486.7964(6000)}{1486.7964 + (6000-1)} = 1191.6940$. Use n = 1192

8.86

(a) $\bar{X} \pm Z\frac{\sigma}{\sqrt{n}}\sqrt{\frac{N-n}{N-1}} = 1.99 \pm 1.96\frac{0.05}{\sqrt{100}}\sqrt{\frac{2000-100}{2000-1}}$

1.9804 ≤ μ ≤ 1.9996

(b) $n_0 = \frac{Z^2\sigma^2}{e^2} = \frac{1.96^2(0.05^2)}{0.01^2} = 96.0364$

$n = \frac{n_0 N}{n_0 + (N-1)} = \frac{96.0364(2000)}{96.0364 + (2000-1)} = 91.6799$. Use n = 92

(c) (a) $\bar{X} \pm Z\frac{\sigma}{\sqrt{n}}\sqrt{\frac{N-n}{N-1}} = 1.99 \pm 1.96\frac{0.05}{\sqrt{100}}\sqrt{\frac{1000-100}{1000-1}}$

1.9807 ≤ μ ≤ 1.9993

(b) $n_0 = \frac{Z^2\sigma^2}{e^2} = \frac{1.96^2(0.05^2)}{0.01^2} = 96.0364$

$n = \frac{n_0 N}{n_0 + (N-1)} = \frac{96.0364(1000)}{96.0364 + (1000-1)} = 87.7015$. Use n = 88

8.87

(a) $\bar{X} \pm t_{n-1}\frac{S}{\sqrt{n}}\sqrt{\frac{N-n}{N-1}} = 2.55 \pm 2.0930\frac{0.44}{\sqrt{20}}\sqrt{\frac{300-20}{300-1}}$

$2.35 ≤ μ ≤ $2.75

(b) $\bar{X} \pm t_{n-1}\frac{S}{\sqrt{n}}\sqrt{\frac{N-n}{N-1}} = 2.55 \pm 2.0930\frac{0.44}{\sqrt{20}}\sqrt{\frac{500-20}{500-1}}$

$2.35 ≤ μ ≤ $2.75

Chapter 9

9.80

$H_0: \mu \ge 7$, $H_1: \mu < 7$, α = 0.05, n = 16, σ = 0.2

Lower critical value: $Z_L = -1.6449$, $\bar{X}_L = \mu + Z_L\left(\frac{\sigma}{\sqrt{n}}\right) = 7 - 1.6449\left(\frac{0.2}{\sqrt{16}}\right) = 6.9178$

(a) $Z_{STAT} = \frac{\bar{X}_L - \mu_1}{\sigma/\sqrt{n}} = \frac{6.9178 - 6.9}{0.2/\sqrt{16}} = 0.3551$

power = $1 - \beta = P(\bar{X} < \bar{X}_L) = P(Z < 0.3551) = 0.6388$

β = 1 – 0.6388 = 0.3612

(b) $Z_{STAT} = \frac{\bar{X}_L - \mu_1}{\sigma/\sqrt{n}} = \frac{6.9178 - 6.8}{0.2/\sqrt{16}} = 2.3551$

power = $1 - \beta = P(\bar{X} < \bar{X}_L) = P(Z < 2.3551) = 0.9907$

β = 1 – 0.9907 = 0.0093

9.81

$H_0: \mu \ge 7$, $H_1: \mu < 7$, α = 0.01, n = 16, σ = 0.2

Lower critical value: $Z_L = -2.3263$, $\bar{X}_L = \mu + Z_L\left(\frac{\sigma}{\sqrt{n}}\right) = 7 - 2.3263\left(\frac{0.2}{\sqrt{16}}\right) = 6.8837$

(a) $Z_{STAT} = \frac{\bar{X}_L - \mu_1}{\sigma/\sqrt{n}} = \frac{6.8837 - 6.9}{0.2/\sqrt{16}} = -0.3263$

power = $1 - \beta = P(\bar{X} < \bar{X}_L) = P(Z < -0.3263) = 0.3721$

β = 1 – 0.3721 = 0.6279

(b) $Z_{STAT} = \frac{\bar{X}_L - \mu_1}{\sigma/\sqrt{n}} = \frac{6.8837 - 6.8}{0.2/\sqrt{16}} = 1.6737$

power = $1 - \beta = P(\bar{X} < \bar{X}_L) = P(Z < 1.6737) = 0.9529$

β = 1 – 0.9529 = 0.0471

(c) Holding everything else constant, the greater the distance between the true mean and the hypothesized mean, the higher the power of the test will be and the lower the probability of committing a Type II error will be. Holding everything else constant, the smaller the level of significance, the lower the power of the test will be and the higher the probability of committing a Type II error will be.

9.82

$H_0: \mu \ge 7$, $H_1: \mu < 7$, α = 0.05, n = 25, σ = 0.2

Lower critical value: $Z_L = -1.6449$, $\bar{X}_L = \mu + Z_L\left(\frac{\sigma}{\sqrt{n}}\right) = 7 - 1.6449\left(\frac{0.2}{\sqrt{25}}\right) = 6.9342$

(a) $Z_{STAT} = \frac{\bar{X}_L - \mu_1}{\sigma/\sqrt{n}} = \frac{6.9342 - 6.9}{0.2/\sqrt{25}} = 0.8551$

power = $1 - \beta = P(\bar{X} < \bar{X}_L) = P(Z < 0.8551) = 0.8038$

β = 1 – 0.8038 = 0.1962

(b) $Z_{STAT} = \frac{\bar{X}_L - \mu_1}{\sigma/\sqrt{n}} = \frac{6.9342 - 6.8}{0.2/\sqrt{25}} = 3.3551$

power = $1 - \beta = P(\bar{X} < \bar{X}_L) = P(Z < 3.3551) = 0.9996$

β = 1 – 0.9996 = 0.0004

(c) Holding everything else constant, the larger the sample size, the higher the power of the test will be and the lower the probability of committing a Type II error will be.

9.83

$H_0: \mu \ge 25{,}000$, $H_1: \mu < 25{,}000$, α = 0.05, n = 100, σ = 3500

Lower critical value: $Z_L = -1.6449$, $\bar{X}_L = \mu + Z_L\left(\frac{\sigma}{\sqrt{n}}\right) = 25{,}000 - 1.6449\left(\frac{3{,}500}{\sqrt{100}}\right) = 24{,}424.3013$

(a) $Z_{STAT} = \frac{\bar{X}_L - \mu_1}{\sigma/\sqrt{n}} = \frac{24{,}424.3012 - 24{,}000}{3500/\sqrt{100}} = 1.2123$

power = $1 - \beta = P(\bar{X} < \bar{X}_L) = P(Z < 1.2123) = 0.8873$

β = 1 – 0.8873 = 0.1127

(b) $Z_{STAT} = \frac{\bar{X}_L - \mu_1}{\sigma/\sqrt{n}} = \frac{24{,}424.3012 - 24{,}900}{3500/\sqrt{100}} = -1.3591$

power = $1 - \beta = P(\bar{X} < \bar{X}_L) = P(Z < -1.3591) = 0.0871$

β = 1 – 0.0871 = 0.9129

9.84

$H_0: \mu \ge 25{,}000$ vs. $H_1: \mu < 25{,}000$, α = 0.01, n = 100, σ = 3500

Lower critical value: $Z_L = -2.3263$, $\bar{X}_L = \mu + Z_L\left(\frac{\sigma}{\sqrt{n}}\right) = 25{,}000 - 2.3263\left(\frac{3{,}500}{\sqrt{100}}\right) = 24{,}185.7786$

(a) $Z_{STAT} = \frac{\bar{X}_L - \mu_1}{\sigma/\sqrt{n}} = \frac{24{,}185.7786 - 24{,}000}{3500/\sqrt{100}} = 0.5308$

power = $1 - \beta = P(\bar{X} < \bar{X}_L) = P(Z < 0.5308) = 0.7022$

β = 1 – 0.7022 = 0.2978

(b) $Z_{STAT} = \frac{\bar{X}_L - \mu_1}{\sigma/\sqrt{n}} = \frac{24{,}185.7786 - 24{,}900}{3500/\sqrt{100}} = -2.0406$

power = $1 - \beta = P(\bar{X} < \bar{X}_L) = P(Z < -2.0406) = 0.0206$

β = 1 – 0.0206 = 0.9794

(c) Holding everything else constant, the greater the distance between the true mean and the hypothesized mean, the higher the power of the test will be and the lower the probability of committing a Type II error will be. Holding everything else constant, the smaller the level of significance, the lower the power of the test will be and the higher the probability of committing a Type II error will be.

9.85

$H_0: \mu \ge 25{,}000$, $H_1: \mu < 25{,}000$, α = 0.05, n = 25, σ = 3500

Lower critical value: $Z_L = -1.6449$, $\bar{X}_L = \mu + Z_L\left(\frac{\sigma}{\sqrt{n}}\right) = 25{,}000 - 1.6449\left(\frac{3{,}500}{\sqrt{25}}\right) = 23{,}848.6026$

(a) $Z_{STAT} = \frac{\bar{X}_L - \mu_1}{\sigma/\sqrt{n}} = \frac{23{,}848.6026 - 24{,}000}{3500/\sqrt{25}} = -0.2163$

power = $1 - \beta = P(\bar{X} < \bar{X}_L) = P(Z < -0.2163) = 0.4144$

β = 1 – 0.4144 = 0.5856

(b) $Z_{STAT} = \frac{\bar{X}_L - \mu_1}{\sigma/\sqrt{n}} = \frac{23{,}848.6026 - 24{,}900}{3500/\sqrt{25}} = -1.5020$

power = $1 - \beta = P(\bar{X} < \bar{X}_L) = P(Z < -1.5020) = 0.0665$

β = 1 – 0.0665 = 0.9335

(c) Holding everything else constant, the larger the sample size, the higher the power of the test will be and the lower the probability of committing a Type II error will be.

9.86

$H_0: \mu = 25{,}000$, $H_1: \mu \ne 25{,}000$, α = 0.05, n = 100, σ = 3500

Critical values: $Z_L = -1.960$, $Z_U = 1.960$

$\bar{X}_L = \mu + Z_L\left(\frac{\sigma}{\sqrt{n}}\right) = 25{,}000 - 1.960\left(\frac{3{,}500}{\sqrt{100}}\right) = 24{,}314.0130$

$\bar{X}_U = \mu + Z_U\left(\frac{\sigma}{\sqrt{n}}\right) = 25{,}000 + 1.960\left(\frac{3{,}500}{\sqrt{100}}\right) = 25{,}685.9870$

(a) $\beta = P(\bar{X}_L < \bar{X} < \bar{X}_U) = P(0.8972 < Z < 4.8171) = 0.1848$

power = 1 – β = 1 – 0.1848 = 0.8152

(b) $\beta = P(\bar{X}_L < \bar{X} < \bar{X}_U) = P(-1.6742 < Z < 2.2457) = 0.9406$

power = 1 – β = 1 – 0.9406 = 0.0594

(c) A one-tail test is more powerful than a two-tail test, holding everything else constant.
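The power calculations in Problems 9.80 through 9.86 follow the same steps each time: find the critical value of the sample mean under the null hypothesis, then find the probability of falling in the rejection region when the true mean is μ1. The sketch below uses Python's standard library; the function names are assumptions made for this illustration, and the critical Z values are computed at full precision rather than read from a table.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal distribution


def power_lower_tail(mu0, mu1, sigma, n, alpha):
    """Power of a lower-tail Z test (H1: mu < mu0) when the true mean is mu1."""
    se = sigma / sqrt(n)
    x_crit = mu0 + STD_NORMAL.inv_cdf(alpha) * se  # lower critical sample mean
    return STD_NORMAL.cdf((x_crit - mu1) / se)


def power_two_tail(mu0, mu1, sigma, n, alpha):
    """Power of a two-tail Z test (H1: mu != mu0) when the true mean is mu1."""
    se = sigma / sqrt(n)
    z = STD_NORMAL.inv_cdf(1 - alpha / 2)
    x_lo, x_hi = mu0 - z * se, mu0 + z * se
    # beta is the chance the sample mean lands between the critical values
    beta = STD_NORMAL.cdf((x_hi - mu1) / se) - STD_NORMAL.cdf((x_lo - mu1) / se)
    return 1 - beta


one_tail = power_lower_tail(25_000, 24_000, 3500, 100, 0.05)  # Problem 9.83 (a)
two_tail = power_two_tail(25_000, 24_000, 3500, 100, 0.05)    # Problem 9.86 (a)
print(f"one-tail power {one_tail:.4f}, two-tail power {two_tail:.4f}")
```

Comparing the two printed values for the same μ1, σ, n, and α illustrates the point made in Problem 9.86 (c).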

Chapter 11

11.43

(a) df A = c – 1 = 5 – 1 = 4
(b) df BL = r – 1 = 7 – 1 = 6
(c) df E = (r – 1)(c – 1) = (7 – 1)(5 – 1) = 24
(d) df T = rc – 1 = 7 × 5 – 1 = 34

11.44

(a) SSE = SST – SSA – SSBL = 210 – 60 – 75 = 75

(b) $MSA = \frac{SSA}{c-1} = \frac{60}{4} = 15$

$MSBL = \frac{SSBL}{r-1} = \frac{75}{6} = 12.5$

$MSE = \frac{SSE}{(r-1)(c-1)} = \frac{75}{6 \cdot 4} = 3.125$

(c) $F_{STAT} = \frac{MSA}{MSE} = \frac{15}{3.125} = 4.80$

(d) $F_{STAT} = \frac{MSBL}{MSE} = \frac{12.5}{3.125} = 4.00$

11.45

(a)

Source          df    SS     MS       F
Among groups    4     60     15       4.80
Among blocks    6     75     12.5     4.00
Error           24    75     3.125
Total           34    210

(b) For testing the treatment means: Decision rule: If FSTAT > 2.78, reject H0. Decision: Since FSTAT = 4.80 is greater than the upper critical bound of 2.78, reject H0.

(c) For testing the block means: Decision rule: If FSTAT > 2.51, reject H0. Decision: Since FSTAT = 4.00 > 2.51, reject H0. There is enough evidence of a difference due to blocks.

11.46

(a) There are 5 degrees of freedom in the numerator and 24 degrees of freedom in the denominator.

(b) Q = 4.17

(c) critical range $= Q\sqrt{\frac{MSE}{r}} = 4.17\sqrt{\frac{3.125}{7}} = 2.786$
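The arithmetic in Problems 11.44 through 11.46 can be collected into one short routine. This is only an illustrative sketch; the function name `randomized_block_anova` and the dictionary keys are assumptions for this example, and the Q value is the one read from the Studentized range table in Problem 11.46.

```python
from math import sqrt


def randomized_block_anova(ssa, ssbl, sst, c, r):
    """Mean squares and F statistics for a randomized block design with
    c groups (treatments) and r blocks, starting from the sums of squares."""
    sse = sst - ssa - ssbl
    msa = ssa / (c - 1)
    msbl = ssbl / (r - 1)
    mse = sse / ((r - 1) * (c - 1))
    return {
        "SSE": sse,
        "MSA": msa,
        "MSBL": msbl,
        "MSE": mse,
        "F_treatments": msa / mse,
        "F_blocks": msbl / mse,
    }


table = randomized_block_anova(ssa=60, ssbl=75, sst=210, c=5, r=7)

# Tukey critical range, with Q = 4.17 taken from the table lookup
critical_range = 4.17 * sqrt(table["MSE"] / 7)

print(table)
print(f"critical range = {critical_range:.3f}")
```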

11.47

(a) df A = c – 1 = 3 – 1 = 2
(b) df BL = r – 1 = 7 – 1 = 6
(c) df E = (r – 1)(c – 1) = (7 – 1)(3 – 1) = 12
(d) df T = rc – 1 = 7 × 3 – 1 = 20

11.48

(a) $MSE = \frac{MSA}{F} = \frac{18}{6} = 3$

SSE = (MSE)(df E) = (3)(12) = 36

(b) SSBL = (F)(MSE)(df BL) = (4)(3)(6) = 72

(c) SST = SSA + SSBL + SSE = 36 + 72 + 36 = 144

(d) Since FSTAT = 6 < F0.01,2,12 = 6.9266, do not reject the null hypothesis of no treatment effect. There is not enough evidence to conclude there is a treatment effect. Since FSTAT = 4.0 < F0.01,6,12 = 4.821, do not reject the null hypothesis of no block effect. There is not enough evidence to conclude there is a block effect.

11.49

Source          df             SS                       MS                       F
Among groups    4 – 1 = 3      3 × 80 = 240             80                       80 ÷ 15.4286 = 5.185
Among blocks    8 – 1 = 7      540                      540 ÷ 7 = 77.1429        5.000
Error           3 × 7 = 21     15.4286 × 21 = 324       77.1429 ÷ 5 = 15.4286
Total           32 – 1 = 31    240 + 540 + 324 = 1104

11.50

(a) Decision rule: If FSTAT > 3.07, reject H0. Decision: Since FSTAT = 5.185 is greater than the critical bound 3.07, reject H0. There is enough evidence to conclude that the treatment means are not all equal.

(b) Decision rule: If FSTAT > 2.49, reject H0. Decision: Since FSTAT = 5.000 is greater than the critical bound 2.49, reject H0. There is enough evidence to conclude that the block means are not all equal.

11.51

$H_0: \mu_A = \mu_B = \mu_C = \mu_D$
H1: At least one mean differs.

Decision rule: If FSTAT > 4.718, reject H0.

Anova: Two-Factor Without Replication

Source of Variation    SS          df    MS          F           P-value     F crit
Rows                   153.2222    8     19.15278    19.06452    1.21E-08    3.362857
Columns                79.63889    3     26.5463     26.42396    8.86E-08    4.718061
Error                  24.11111    24    1.00463
Total                  256.9722    35

Test statistic: FSTAT = 26.42

Decision: Since FSTAT = 26.42 is greater than the critical bound 4.718, reject H0. There is adequate evidence to conclude that there is a difference in the mean summed ratings of the four brands of Colombian coffee.

From Table E.10, Q = 3.9. Critical range $= Q\sqrt{\frac{MSE}{r}} = 3.9\sqrt{\frac{1.0046}{9}} = 1.303$

Pairs of means that differ at the 0.05 level are marked with * below.

$|\bar{X}_A - \bar{X}_B| = 1.56$*
$|\bar{X}_A - \bar{X}_C| = 0.89$
$|\bar{X}_A - \bar{X}_D| = 2.56$*
$|\bar{X}_B - \bar{X}_C| = 2.45$*
$|\bar{X}_B - \bar{X}_D| = 4.12$*
$|\bar{X}_C - \bar{X}_D| = 1.67$*

Brand B is rated highest with a sample mean rating of 25.56.

11.52

(a)

$H_0: \mu_{.1} = \mu_{.2}$ where 1 = Internet Service, 2 = TV Service
H1: Not all $\mu_{.j}$ are equal where j = 1, 2

From PHStat

ANOVA
Source of Variation    SS           df    MS          F          P-value    F crit
Rows                   1067.4750    19    56.1829     11.8247    0.0000     2.1683
Columns                265.2250     1     265.2250    55.8214    0.0000     4.3807
Error                  90.2750      19    4.7513
Total                  1422.9750    39

Level of significance    0.05

FSTAT = 55.8214. Since the p-value is virtually 0 < 0.05, reject H0. There is evidence of a difference in the mean rating between Internet Service and TV Service.

(b)

PHStat output for the Tukey procedure:

The mean ratings for the two services are significantly different from each other, with Internet Service the lowest, followed by TV Service.

11.53

(a)

H0: .1  .2  .3  .4 where 1 = Publix, 2 = Winn-Dixie, 3 = Target, 4 = Walmart H1: Not all . j are equal where j = 1, 2, 3, 4 Excel Output:

Copyright ©2024 Pearson Education, Inc.


(b)

FSTAT = 3.6831. Since p-value = 0.0147 < 0.05, reject H0. There is evidence of a difference between the mean price of these items at the four supermarkets. The assumptions needed are: (i) samples are randomly and independently drawn, (ii) populations are normally distributed, (iii) populations have equal variances and (iv) no interaction effect between treatments and blocks.

Copyright ©2024 Pearson Education, Inc.


Solutions to End-of-Section and Chapter Review Problems cxxxix 11.53 cont.

(c)

(d)

11.54

(a)

Excel output for the Tukey procedure:

Using Q = 3.69 for numerator d.f. = 4 and denominator d.f. = 120, the mean price of items at Walmart differs from that of Winn-Dixie at 5% level of significance. H 0 : 1.   2.   33. H1 : Not all  i . are equal where i = 1, 2, …, 33 FSTAT =18.5723 and p-value is essentially 0. Reject H0. There is evidence of a significant block effect in this experiment. The blocking has been advantageous in reducing the experimental error. H0: .1  .2  .3 where 1 = One Year CD, 2 = Two Year CD, 3 = Five Year CD H1: Not all . j are equal where j = 1, 2, 3 Excel output: ANOVA Source of Variation

(b)

SS

df

MS

F

P-value

F crit

Rows

39.4541

37

1.0663

34.1775

0.0000 1.7295

Columns

1.1210

1

1.1210

35.9285

0.0000 4.1055

Error

1.1544

37

0.0312

Total

41.7295

75

FSTAT = 35.9285. Since the p-value is virtually 0, reject H0. There is evidence of a difference in the mean rates for these investments. The assumptions needed are: (i) samples are randomly and independently drawn, (ii) populations are normally distributed, (iii) populations have equal variances and (iv) no interaction effect between treatments and blocks.

11.54 cont.

(c)

Excel output of the Tukey procedure:

Using Q = 3.40 for numerator d.f. = 3 and denominator d.f. = 60, the mean rates of these investments are not all different with One-Year CD being the lowest, followed by Two-Year CD and finally Five-Year CD.

(d)

$H_0: \mu_{1.} = \mu_{2.} = \cdots = \mu_{16.}$
$H_1$: Not all $\mu_{i.}$ are equal, where i = 1, 2, …, 16
FSTAT = 34.1775. Since the p-value is virtually 0, reject H0. There is enough evidence of a significant block effect in this experiment. The blocking has been advantageous in reducing the experimental error.

11.55

To test at the 0.01 level of significance whether there is any difference in the mean thickness of the wafers for the five positions, you conduct an F test:
$H_0: \mu_1 = \mu_2 = \mu_3 = \mu_4 = \mu_5$, where 1 = position 1, 2 = position 2, 3 = position 18, 4 = position 19, 5 = position 28
$H_1$: At least one mean is different.
Decision rule: df: 4, 116. If FSTAT > 3.4852, reject H0.

ANOVA
Source of Variation         SS    df         MS          F    P-value     F crit
Rows                     601.5    29   20.74138   5.922219   1.93E-12   1.878497
Columns               1417.733     4   354.4333   101.2002   6.84E-37   3.485212
Error                 406.2667   116   3.502299
Total                   2425.5   149

Test statistic: FSTAT = 101.2
Decision: Since FSTAT = 101.2 is greater than the critical bound of 3.4852, reject H0. There is enough evidence to conclude that the means of the thickness of the wafers are different across the five positions.


To determine which of the means are significantly different from one another, you use the Tukey-Kramer procedure to establish the critical range:
Q = 4.71
critical range $= Q\sqrt{\dfrac{MSE}{r}} = 4.71\sqrt{\dfrac{3.5023}{30}} = 1.609$

Pairs of means that differ at the 0.01 level are marked with * below.
$|\bar{X}_1 - \bar{X}_2| = 2.2$*
$|\bar{X}_1 - \bar{X}_3| = 5.533$*
$|\bar{X}_1 - \bar{X}_4| = 8.567$*
$|\bar{X}_1 - \bar{X}_5| = 6.533$*
$|\bar{X}_2 - \bar{X}_3| = 3.333$*
$|\bar{X}_2 - \bar{X}_4| = 6.367$*
$|\bar{X}_2 - \bar{X}_5| = 4.333$*
$|\bar{X}_3 - \bar{X}_4| = 3.033$*
$|\bar{X}_3 - \bar{X}_5| = 1$
$|\bar{X}_4 - \bar{X}_5| = 2.033$*
At the 1% level of significance, the F test concludes that there are significant differences in the mean thickness of the wafers among the 5 positions. The Tukey-Kramer multiple comparison test reveals that the mean thickness differs significantly for every pair of positions except the pair of position 18 and position 28.
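The critical-range comparison above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, Q = 4.71 is taken from the solution rather than computed, and the absolute differences are the ones listed in this problem.

```python
import math

def tukey_critical_range(q, mse, r):
    """Critical range Q * sqrt(MSE / r) for a randomized block design with r blocks."""
    return q * math.sqrt(mse / r)

def flag_pairs(abs_diffs, critical_range):
    """Map each pair to True when its absolute mean difference exceeds the critical range."""
    return {pair: diff > critical_range for pair, diff in abs_diffs.items()}

# Values copied from the solution to Problem 11.55 (Q is read from a table, not computed here).
critical_range = tukey_critical_range(q=4.71, mse=3.5023, r=30)

abs_diffs = {
    (1, 2): 2.2, (1, 3): 5.533, (1, 4): 8.567, (1, 5): 6.533,
    (2, 3): 3.333, (2, 4): 6.367, (2, 5): 4.333,
    (3, 4): 3.033, (3, 5): 1.0, (4, 5): 2.033,
}
flags = flag_pairs(abs_diffs, critical_range)
not_significant = [pair for pair, significant in flags.items() if not significant]

print(round(critical_range, 3))
print(not_significant)
```

Here the labels 1 through 5 follow the problem's coding, so the pair (3, 5) corresponds to positions 18 and 28.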

11.56

(a)

$H_0: \mu_1 = \mu_2 = \mu_3$, where 1 = 2 days, 2 = 7 days, 3 = 28 days
$H_1$: At least one mean differs.
Decision rule: If FSTAT > 3.114, reject H0.

ANOVA
Source of Variation         SS    df         MS          F    P-value     F crit
Rows                  21.17006    39   0.542822   5.752312   2.92E-11   1.553239
Columns               50.62835     2   25.31417   268.2556   1.09E-35   3.113797
Error                 7.360538    78   0.094366
Total                 79.15894   119

Test statistic: F = 268.26
Decision: Since FSTAT = 268.26 is greater than the critical bound 3.114, reject H0. There is enough evidence to conclude that there is a difference in the mean compressive strength after 2, 7 and 28 days.

(b)

From Table E.10, Q = 3.4.
critical range $= Q\sqrt{\dfrac{MSE}{r}} = 3.4\sqrt{\dfrac{0.0944}{40}} = 0.1651$
$|\bar{X}_1 - \bar{X}_2| = 0.5531$*
$|\bar{X}_1 - \bar{X}_3| = 1.5685$*
$|\bar{X}_2 - \bar{X}_3| = 1.0154$*
At the 0.05 level of significance, all of the comparisons are significant. This is consistent with the results of the F-test indicating that there is a significant difference in the mean compressive strength after 2, 7 and 28 days.

(c)

$RE = \dfrac{(r-1)MSBL + r(c-1)MSE}{(rc-1)MSE} = \dfrac{39(0.5428) + 40(2)(0.0943)}{119(0.0943)} = 2.558$
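As a quick arithmetic check of the estimated relative efficiency, the following sketch evaluates the formula with the rounded mean squares quoted in the solution. The function name is hypothetical, and because the inputs are rounded the result agrees with the reported 2.558 only to about three significant digits.

```python
def relative_efficiency(r, c, msbl, mse):
    """Estimated relative efficiency of a randomized block design.

    r: number of blocks, c: number of treatment groups,
    msbl: mean square among blocks, mse: mean square error.
    """
    return ((r - 1) * msbl + r * (c - 1) * mse) / ((r * c - 1) * mse)

# Rounded values from the ANOVA table of Problem 11.56 (40 samples, 3 time periods).
re_value = relative_efficiency(r=40, c=3, msbl=0.5428, mse=0.0943)
print(round(re_value, 3))
```

A value above 1 suggests the blocked design needs fewer observations per group than a completely randomized design to reach the same precision.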

(d)

Box-and-whisker plot of compressive strength for Two days, Seven days, and 28 days (horizontal axis from 0 to 6).

(e)

The compressive strength of the concrete increases over the 3 time periods.



Chapter 12

12.60

(a)

$H_0: \pi_1 = \pi_2$
$H_1: \pi_1 \neq \pi_2$
where 1 = group 1, 2 = group 2
Decision rule: If ZSTAT < –1.96 or ZSTAT > 1.96, reject H0.
Test statistic: $Z_{STAT} = \dfrac{B - C}{\sqrt{B + C}} = \dfrac{25 - 16}{\sqrt{25 + 16}} = 1.4056$
Decision: Since ZSTAT = 1.4056 is between the lower critical bound of –1.96 and the upper critical bound of 1.96, do not reject H0. There is not enough evidence of a difference between group 1 and group 2.
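The McNemar statistic used here and in the problems that follow depends only on the two discordant counts B and C. The sketch below is illustrative: the function names are hypothetical, and the normal CDF is built from the standard library's `math.erf` rather than taken from a statistics package.

```python
import math

def mcnemar_z(b, c):
    """McNemar test statistic Z = (B - C) / sqrt(B + C) from the discordant counts."""
    return (b - c) / math.sqrt(b + c)

def normal_cdf(z):
    """Standard normal cumulative probability, via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Problem 12.60: B = 25, C = 16, two-tail test.
z_stat = mcnemar_z(25, 16)
p_two_tail = 2.0 * (1.0 - normal_cdf(abs(z_stat)))

print(round(z_stat, 4))
print(round(p_two_tail, 4))
```

For a lower-tail alternative, as in Problem 12.61, the p-value would be `normal_cdf(z_stat)` instead of the doubled tail area.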

12.61

(a)

$H_0: \pi_1 \geq \pi_2$
$H_1: \pi_1 < \pi_2$
where 1 = beginning, 2 = end
Decision rule: If ZSTAT < –1.645, reject H0.
Test statistic: $Z_{STAT} = \dfrac{B - C}{\sqrt{B + C}} = \dfrac{9 - 22}{\sqrt{9 + 22}} = -2.3349$
Decision: Since ZSTAT = –2.3349 < –1.645, reject H0. There is enough evidence to conclude that the proportion of coffee drinkers who prefer Brand A is lower at the beginning of the advertising campaign than at the end of the advertising campaign.

(b)

p-value = 0.0098. The probability of obtaining a data set which gives rise to a test statistic smaller than –2.3349 is 0.0098 if the proportion of coffee drinkers who prefer Brand A is not lower at the beginning of the advertising campaign than at the end of the advertising campaign.

12.62

(a)

$H_0: \pi_1 = \pi_2$
$H_1: \pi_1 \neq \pi_2$
where 1 = prior, 2 = after
Decision rule: If ZSTAT < –2.5758 or ZSTAT > 2.5758, reject H0.
Test statistic: $Z_{STAT} = \dfrac{B - C}{\sqrt{B + C}} = \dfrac{21 - 36}{\sqrt{21 + 36}} = -1.9868$
Decision: Since ZSTAT = –1.9868 is between the two critical bounds, do not reject H0. There is not enough evidence to conclude there is a difference in the proportion of voters who favored Candidate A prior to and after the debate.

(b)

p-value = 0.0469. The probability of obtaining a sample which gives rise to a test statistic that differs from 0 by 1.9868 or more in either direction is 0.0469 if there is no difference in the proportion of voters who favor Candidate A prior to and after the debate.

12.63

(a)

$H_0: \pi_1 \geq \pi_2$
$H_1: \pi_1 < \pi_2$
where 1 = before, 2 = after
Decision rule: If ZSTAT < –1.645, reject H0.
Test statistic: $Z_{STAT} = \dfrac{B - C}{\sqrt{B + C}} = \dfrac{5 - 15}{\sqrt{5 + 15}} = -2.2361$
Decision: Since ZSTAT = –2.2361 < –1.645, reject H0. There is enough evidence to conclude that the proportion who prefer Brand A is lower before the advertising than after the advertising.

(b)

p-value = 0.0127. The probability of obtaining a data set which gives rise to a test statistic smaller than –2.2361 is 0.0127 if the proportion who prefer Brand A is not lower before the advertising than after the advertising.


12.64

(a)

$H_0: \pi_1 \geq \pi_2$
$H_1: \pi_1 < \pi_2$
where 1 = last year, 2 = now
Decision rule: If ZSTAT < –1.645, reject H0.
Test statistic: $Z_{STAT} = \dfrac{B - C}{\sqrt{B + C}} = \dfrac{5 - 20}{\sqrt{5 + 20}} = -3$
Decision: Since ZSTAT = –3 < –1.645, reject H0. There is enough evidence to conclude that satisfaction was lower last year prior to introduction of Six Sigma management.

(b)

p-value = 0.0014. The probability of obtaining a data set which gives rise to a test statistic smaller than –3 is 0.0014 if the satisfaction was not lower last year prior to introduction of Six Sigma management.

12.65

(a)

$H_0: \pi_1 \geq \pi_2$
$H_1: \pi_1 < \pi_2$
where 1 = year 1, 2 = year 2
Decision rule: If ZSTAT < –1.645, reject H0.
Test statistic: $Z_{STAT} = \dfrac{B - C}{\sqrt{B + C}} = \dfrac{4 - 25}{\sqrt{4 + 25}} = -3.8996$
Decision: Since ZSTAT = –3.8996 < –1.645, reject H0. There is enough evidence to conclude that the proportion of employees absent less than 5 days was lower in year 1 than in year 2.

(b)

p-value is virtually zero. The probability of obtaining a data set which gives rise to a test statistic smaller than –3.8996 is virtually zero if the proportion of employees absent less than 5 days was not lower in year 1 than in year 2.

12.66

(a)

For df = 25 and $\alpha$ = 0.01, $\chi^2_{\alpha/2}$ = 10.520 and $\chi^2_{1-\alpha/2}$ = 46.928.

(b)

For df = 16 and $\alpha$ = 0.05, $\chi^2_{\alpha/2}$ = 6.908 and $\chi^2_{1-\alpha/2}$ = 28.845.

(c)

For df = 13 and $\alpha$ = 0.10, $\chi^2_{\alpha/2}$ = 5.892 and $\chi^2_{1-\alpha/2}$ = 22.362.

12.67

(a)

For df = 23 and $\alpha$ = 0.01, $\chi^2_{\alpha/2}$ = 9.2604 and $\chi^2_{1-\alpha/2}$ = 44.1814.

(b)

For df = 19 and $\alpha$ = 0.05, $\chi^2_{\alpha/2}$ = 8.9065 and $\chi^2_{1-\alpha/2}$ = 32.8523.

(c)

For df = 15 and $\alpha$ = 0.10, $\chi^2_{\alpha/2}$ = 7.2609 and $\chi^2_{1-\alpha/2}$ = 24.9958.

12.68

$\chi^2_{STAT} = \dfrac{(n-1) \cdot S^2}{\sigma^2} = \dfrac{24 \cdot 150^2}{100^2} = 54$

12.69

$\chi^2_{STAT} = \dfrac{(n-1) \cdot S^2}{\sigma^2} = \dfrac{15 \cdot 10^2}{12^2} = 10.417$

12.70

df = n – 1 = 16 – 1 = 15

12.71

(a)

For df = 15 and $\alpha$ = 0.05, $\chi^2_{\alpha/2}$ = 6.262 and $\chi^2_{1-\alpha/2}$ = 27.488.

(b)

For df = 15 and $\alpha$ = 0.05, the lower-tail critical value is $\chi^2$ = 7.261.

12.72

(a)

If $H_1: \sigma \neq 12$, do not reject H0 since the test statistic $\chi^2_{STAT}$ = 10.417 falls between the two critical bounds, $\chi^2_{\alpha/2}$ = 6.262 and $\chi^2_{1-\alpha/2}$ = 27.488.

(b)

If $H_1: \sigma < 12$, do not reject H0 since the test statistic $\chi^2_{STAT}$ = 10.417 is greater than the critical bound 7.261.

12.73

You must assume that the data in the population are normally distributed to be able to use the chi-square test of a population variance or standard deviation. If the data selected do not come from an approximately normally distributed population, particularly for small sample sizes, the accuracy of the test can be seriously affected.
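A minimal sketch of the chi-square statistic for a population standard deviation, as used in Problems 12.68 through 12.77, is shown below. The function names are hypothetical, and the critical values are copied from the solutions (they come from a chi-square table) rather than computed in code.

```python
def chi_square_variance_stat(n, s, sigma0):
    """Chi-square test statistic (n - 1) * S^2 / sigma0^2 for a variance test."""
    return (n - 1) * s ** 2 / sigma0 ** 2

def reject_two_tail(stat, lower, upper):
    """Two-tail decision: reject H0 when the statistic falls outside [lower, upper]."""
    return stat < lower or stat > upper

# Problem 12.69 / 12.72(a): n = 16, S = 10, hypothesized sigma = 12, alpha = 0.05.
stat = chi_square_variance_stat(n=16, s=10, sigma0=12)
reject = reject_two_tail(stat, lower=6.262, upper=27.488)

print(round(stat, 3))
print(reject)
```

The same function reproduces the other statistics in this group, for example `chi_square_variance_stat(30, 2.1, 1.2)` for the oven-temperature problem.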

12.74

(a)

$H_0: \sigma \leq 1.2$°F. The standard deviation of the oven temperature has not increased above 1.2°F.
$H_1: \sigma > 1.2$°F. The standard deviation of the oven temperature has increased above 1.2°F.
Decision rule: df = 29. If $\chi^2_{STAT}$ > 42.557, reject H0.
Test statistic: $\chi^2_{STAT} = \dfrac{(n-1) \cdot S^2}{\sigma^2} = \dfrac{29 \cdot 2.1^2}{1.2^2} = 88.813$
Decision: Since the test statistic of $\chi^2_{STAT}$ = 88.813 is greater than the critical boundary of 42.557, reject H0. There is sufficient evidence to conclude that the standard deviation of the oven temperature has increased above 1.2°F.

(b)

You must assume that the data in the population are normally distributed to be able to use the chi-square test of a population variance or standard deviation.

(c)

p-value = $5.53 \times 10^{-8}$ or 0.00000005. The probability that a sample is obtained whose standard deviation is equal to or larger than 2.1°F when the null hypothesis is true is $5.53 \times 10^{-8}$, a very small probability. Note: The p-value was found using Excel.

12.75

(a)

$H_0: \sigma = \$200$. The standard deviation of the amount of auto repairs is equal to $200.
$H_1: \sigma \neq \$200$. The standard deviation of the amount of auto repairs is not equal to $200.
Decision rule: df = 24. If $\chi^2_{STAT}$ < 12.401 or $\chi^2_{STAT}$ > 39.364, reject H0.
Test statistic: $\chi^2_{STAT} = \dfrac{(n-1) \cdot S^2}{\sigma^2} = \dfrac{24 \cdot 237.52^2}{200^2} = 33.849$
Decision: Since the test statistic of $\chi^2_{STAT}$ = 33.849 is between the critical boundaries of 12.401 and 39.364, do not reject H0. There is insufficient evidence to conclude that the standard deviation of the amount of auto repairs is not equal to $200.

(b)

You must assume that the data in the population are normally distributed to be able to use the chi-square test of a population variance or standard deviation.

(c)

p-value = 2(0.0874) = 0.1748. The probability of obtaining a sample whose standard deviation will give rise to a test statistic equal to or more extreme than 33.849 is 0.1748 when the null hypothesis is true. Note: The p-value was found using Excel.

12.76

(a)

$H_0: \sigma = 12$.
$H_1: \sigma \neq 12$.
Decision rule: df = 14. If $\chi^2_{STAT}$ < 6.571 or $\chi^2_{STAT}$ > 23.685, reject H0.
Test statistic: $\chi^2_{STAT} = \dfrac{(n-1) \cdot S^2}{\sigma^2} = \dfrac{14 \cdot 9.25^2}{12^2} = 8.319$
Decision: Since the test statistic of $\chi^2_{STAT}$ = 8.319 is between the critical boundaries of 6.571 and 23.685, do not reject H0. There is insufficient evidence that the population standard deviation is different from 12.

(b)

You must assume that the data in the population are normally distributed to be able to use the chi-square test of a population variance or standard deviation.

(c)

p-value = 2(1 – 0.8721) = 0.2558. The probability of obtaining a test statistic equal to or more extreme than the result obtained from this sample data is 0.2558 if the standard deviation is 12.
Note: Excel returns an upper-tail area of 0.8721 for $\chi^2_{STAT}$ = 8.319. But since the sample standard deviation is smaller than the hypothesized value, the amount of area in the lower tail is (1 – 0.8721). That value is doubled to accommodate the two-tail hypotheses.

12.77

(a)

$H_0: \sigma \geq 0.035$ inch. The standard deviation of the diameter of doorknobs is greater than or equal to 0.035 inch in the redesigned production process.
$H_1: \sigma < 0.035$ inch. The standard deviation of the diameter of doorknobs is less than 0.035 inch in the redesigned production process.
Decision rule: df = 24. If $\chi^2_{STAT}$ < 13.848, reject H0.
Test statistic: $\chi^2_{STAT} = \dfrac{(n-1) \cdot S^2}{\sigma^2} = \dfrac{24 \cdot 0.025^2}{0.035^2} = 12.245$
Decision: Since the test statistic of $\chi^2_{STAT}$ = 12.245 is less than the critical boundary of 13.848, reject H0. There is sufficient evidence to conclude that the standard deviation of the diameter of doorknobs is less than 0.035 inch in the redesigned production process.

(b)

You must assume that the data in the population are normally distributed to be able to use the chi-square test of a population variance or standard deviation.

(c)

p-value = (1 – 0.9770) = 0.0230. The probability of obtaining a test statistic equal to or more extreme than the result obtained from this sample data is 0.0230 if the population standard deviation is indeed no less than 0.035 inch.

12.78

(a) WL = 13, WU = 53
(b) WL = 10, WU = 56
(c) WL = 7, WU = 59
(d) WL = 5, WU = 61

12.79

(a) WU = 53
(b) WU = 56
(c) WU = 59
(d) WU = 61

12.80

(a) WL = 13
(b) WL = 10
(c) WL = 7
(d) WL = 5


12.81

Observation     Di   abs(Di)   Sign of Di     R   signed R   R(+)
1              3.2       3.2   +              6          6      6
2              1.7       1.7   +            2.5        2.5    2.5
3              4.5       4.5   +              7          7      7
4                0         0   Discard        -          -      -
5             11.1      11.1   +              9          9      9
6             -0.8       0.8   -              1         -1      0
7              2.3       2.3   +              5          5      5
8               -2         2   -              4         -4      0
9                0         0   Discard        -          -      -
10            14.8      14.8   +             10         10     10
11             5.6       5.6   +              8          8      8
12             1.7       1.7   +            2.5        2.5    2.5

$W = \sum_{i=1}^{n} R_i^{(+)} = 50$

12.82

n = 10,  = 0.05, WL = 8, WU = 47

12.83

Since W = 50 > WU = 47, reject H 0 .

12.84

W   in 1 Ri   = 67.5

12.85

n = 12,  = 0.05, WU = 61

12.86

Since W = 67.5 > WU = 61, reject H 0 .

12.87

(a)

H0: MD = 0 H1: MD  0 where Populations: 1 = A, 2 = B n

n  9, W   Ri    = 2 i 1

(b)

Decision rule: Reject H0 if W < 3 or > 33. Since W = 2 is smaller than 3, reject H0. There is enough evidence of a difference in the median summated rating between brand A and Brand B. In Problem 10.22, you conclude that there is enough evidence of a difference in the mean summated ratings between the two brands. Here, you conclude that there is enough evidence of a difference in the median summated rating between brand A and Brand B. Copyright ©2024 Pearson Education, Inc.


12.88

(a)

$H_0: M_D = 0$, where Populations: 1 = Internet, 2 = TV
$H_1: M_D \neq 0$, where $D_i = X_{Internet} - X_{TV}$
Using Excel

Internet   TV   Di   abs(Di)   Sign of Di      R   signed R     R+
65         60    5         5   +            2.57       2.57   2.57
73         66    7         7   +            14.5       14.5   14.5
61         53    8         8   +              17         17     17
65         60    5         5   +            2.57       2.57   2.57
71         66    5         5   +            2.57       2.57   2.57
62         65   -3         3   -             1.5       -1.5
64         59    5         5   +            2.57       2.57   2.57
61         56    5         5   +            2.57       2.57   2.57
64         64    0   Discard
74         68    6         6   +              13         13     13
56         53    3         3   +             1.5        1.5    1.5
73         65    8         8   +              17         17     17
71         59   12        12   +              19         19     19
53         49    4         4   +               4          4      4
65         60    5         5   +            2.57       2.57   2.57
70         66    4         4   +               4          4      4
71         64    7         7   +            14.5       14.5   14.5
71         63    8         8   +              17         17     17
65         61    4         4   +               4          4      4
61         56    5         5   +            2.57       2.57   2.57

n = 19, $W = \sum_{i=1}^{n} R_i^{(+)} = 143.5$


Decision rule: Reject H0 if W < 46 or W > 144. Since W = 143.5 falls between 46 and 144, do not reject H0. There is insufficient evidence of a difference in the median service rating between Internet and TV service.
Using the large-sample approximation:
$\mu_W = \dfrac{n(n+1)}{4} = \dfrac{19(20)}{4} = 95$
$\sigma_W = \sqrt{\dfrac{n(n+1)(2n+1)}{24}} = \sqrt{\dfrac{19(20)(39)}{24}} = 24.85$
$Z_{STAT} = \dfrac{W - \mu_W}{\sigma_W} = \dfrac{143.5 - 95}{24.85} = 1.95$
Since ZSTAT = 1.95 < 1.96, do not reject H0.

(b)

Using the paired-sample t-test in Problem 10.21, you reject the null hypothesis; you conclude that there is evidence of a difference in the mean service rating between Internet and TV. Using the Wilcoxon signed rank test, you do not reject the null hypothesis; you conclude that there is not enough evidence of a difference in the median service rating between Internet and TV services.
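The large-sample normal approximation to the Wilcoxon signed rank statistic can be checked with a short sketch. The function name is hypothetical, and W and n are taken as given from the solutions; the ranking step itself is not reproduced here.

```python
import math

def wilcoxon_normal_approx(w, n):
    """Return (mu_W, sigma_W, Z) for the signed rank statistic W with n nonzero differences."""
    mu_w = n * (n + 1) / 4.0
    sigma_w = math.sqrt(n * (n + 1) * (2 * n + 1) / 24.0)
    return mu_w, sigma_w, (w - mu_w) / sigma_w

# Problem 12.88: W = 143.5 with n = 19 nonzero differences.
mu_w, sigma_w, z_stat = wilcoxon_normal_approx(w=143.5, n=19)
reject_two_tail = abs(z_stat) > 1.96  # two-tail test at the 0.05 level

print(mu_w, round(sigma_w, 2), round(z_stat, 2))
print(reject_two_tail)
```

The same call with `w=250, n=25` or `w=25, n=10` reproduces the Z statistics reported for Problems 12.89 and 12.90.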



12.89

(a)

$H_0: M_D = 0$, where Populations: 1 = Restaurant, 2 = McDonald’s
$H_1: M_D \neq 0$, where $D_i = X_{Restaurant} - X_{McDonalds}$
Using Excel

Restaurant   McDonalds      Di   abs(Di)   Sign of Di      R   signed R     R+
2.70              4.05   -1.35      1.35   -               5         -5
2.07              5.92   -3.85      3.85   -              14        -14
4.64              5.41   -0.77      0.77   -               4         -4
11.60             9.28    2.32      2.32   +            10.5       10.5   10.5
7.95              5.63    2.32      2.32   +            10.5       10.5   10.5
7.68              6.94    0.74      0.74   +               3          3      3
9.53              7.62    1.91      1.91   +               9          9      9
7.71              5.14    2.57      2.57   +              12         12     12
4.51              3.95    0.56      0.56   +               2          2      2
10.06             4.02    6.04      6.04   +              17         17     17
1.70              7.78   -6.08      6.08   -              18        -18    -18
3.15              4.84   -1.69      1.69   -               7         -7
20.32             8.13   12.19     12.19   +              25         25     25
14.54             8.73    5.81      5.81   +              16         16     16
7.33              5.87    1.46      1.46   +               6          6      6
10.99             4.81    6.18      6.18   +              19         19     19
4.52              6.33   -1.81      1.81   -               8         -8
20.00             9.00   11.00        11   +              24         24     24
17.39            10.44    6.95      6.95   +              21         21     21
17.39             9.28    8.11      8.11   +              22         22     22
5.59              5.96   -0.37      0.37   -               1         -1
13.71             9.14    4.57      4.57   +              15         15     15
15.85             9.23    6.62      6.62   +              20         20     20
8.86              5.57    3.29      3.29   +              13         13     13
26.87            16.12   10.75     10.75   +              23         23     23

n = 25, $W = \sum_{i=1}^{n} R_i^{(+)} = 250$

$\mu_W = \dfrac{n(n+1)}{4} = \dfrac{25(26)}{4} = 162.5$
$\sigma_W = \sqrt{\dfrac{n(n+1)(2n+1)}{24}} = \sqrt{\dfrac{25(26)(51)}{24}} = 37.1652$
$Z_{STAT} = \dfrac{W - \mu_W}{\sigma_W} = \dfrac{250 - 162.5}{37.1652} = 2.35$
Decision rule: Reject H0 if ZSTAT < –1.96 or ZSTAT > 1.96. Since ZSTAT = 2.35 > 1.96, reject H0. There is evidence of a difference in the median meal cost between an inexpensive restaurant and McDonald’s.

(b)

Using the paired-sample t-test in Problem 10.22, you reject the null hypothesis; you conclude that there is evidence that the mean meal cost is higher at an inexpensive restaurant than at McDonald’s. Using the Wilcoxon signed rank test, you reject the null hypothesis; you conclude that there is enough evidence that the median price is different at an inexpensive restaurant than at McDonald’s.

12.90

(a)

$H_0: M_D = 0$, where Populations: 1 = Coffeepot, 2 = K-Cup
$H_1: M_D \neq 0$, where $D_i = X_{Coffeepot} - X_{K\text{-}Cup}$
Using Excel

Coffeepot   K-Cup   Di   abs(Di)   Sign of Di     R   signed R    R+
22             23   -1         1   -            3.5       -3.5
24             21    3         3   +             10         10    10
23             22    1         1   +            3.5        3.5   3.5
25             24    1         1   +            3.5        3.5   3.5
20             22   -2         2   -              8         -8
19             20   -1         1   -            3.5       -3.5
24             25   -1         1   -            3.5       -3.5
25             26   -1         1   -            3.5       -3.5
20             18    2         2   +              8          8     8
19             21   -2         2   -              8         -8

n = 10, $W = \sum_{i=1}^{n} R_i^{(+)} = 25$
Decision rule: Reject H0 if W < 8 or W > 47. Since W = 25 falls between 8 and 47, do not reject H0. There is insufficient evidence of a difference in the median overall scores between coffeepot-brewed and K-cup-brewed coffee.
$\mu_W = \dfrac{n(n+1)}{4} = \dfrac{10(11)}{4} = 27.5$
$\sigma_W = \sqrt{\dfrac{n(n+1)(2n+1)}{24}} = \sqrt{\dfrac{10(11)(21)}{24}} = 9.8107$
$Z_{STAT} = \dfrac{W - \mu_W}{\sigma_W} = \dfrac{25 - 27.5}{9.8107} = -0.2548$
Since ZSTAT = –0.2548 is between –1.96 and 1.96, do not reject H0.

(b)

Using the paired-sample t-test in Problem 10.23, you do not reject the null hypothesis; you conclude that there is insufficient evidence of a difference in the mean scores of coffeepot-brewed and K-cup-brewed coffee. Using the Wilcoxon signed rank test, you do not reject the null hypothesis; you conclude that there is insufficient evidence of a difference in the median overall scores between coffeepot-brewed and K-cup-brewed coffee.

12.91

(a)

$H_0: M_D \geq 0$, where Populations: 1 = two days, 2 = seven days
$H_1: M_D < 0$
Minitab Output:
Wilcoxon Signed Rank Test: Differences

Test of median = 0.000000 versus median < 0.000000

                N for    Wilcoxon             Estimated
            N    Test   Statistic       P        Median
Differen   40      40         0.0   0.000       -0.5100

Since the p-value = 0.000 is smaller than the 0.01 level of significance, reject H0. There is sufficient evidence that the median strength is less at two days than at seven days.

(b)

Using the paired-sample t-test in Problem 10.26, you reject the null hypothesis and conclude that there is enough evidence that the mean strength is less at two days than at seven days. Using the Wilcoxon signed rank test, you reject the null hypothesis and conclude that there is enough evidence that the median strength is less at two days than at seven days.


12.92

d.f. = 5, $\alpha$ = 0.1, $\chi^2_U$ = 9.2363

12.93

(a)

$H_0: M_1 = M_2 = M_3 = M_4 = M_5 = M_6$
$H_1$: At least one of the medians differs.
Reject H0 if FR > 9.2363.

(b)

Since FR = 11.56 > 9.2363, reject H0. There is enough evidence that the medians are different.

12.94

Minitab output:
Friedman Test: Rating versus Brand, Expert

Friedman test for Rating by Brand blocked by Expert

S = 20.03   DF = 3   P = 0.000
S = 20.72   DF = 3   P = 0.000 (adjusted for ties)

                   Est   Sum of
Brand   N       Median    Ranks
A       9       25.000     25.0
B       9       26.750     34.5
C       9       24.000     20.0
D       9       22.250     10.5

Grand median = 24.500

(a)

$H_0: M_A = M_B = M_C = M_D$
$H_1$: Not all medians are equal.
Since the p-value is virtually zero, reject H0 at the 0.05 level of significance. There is evidence of a difference in the median summated ratings of the four brands of Colombian coffee.

(b)

In (a), you conclude that there is evidence of a difference in the median summated ratings of the four brands of Colombian coffee while in problem 11.23, you conclude that there is evidence of a difference in the mean summated ratings of the four brands of Colombian coffee.


12.95

(a)

$H_0: M_{Internet} = M_{TV}$
$H_1$: The medians are not equal.
From Excel

Company                        Internet   Rank   TV   Rank
AT&T                                 65      2   60      1
Armstrong                            73      2   66      1
Atlantic Broadband                   61      2   53      1
Charter (Spectrum)                   65      2   60      1
Cincinnati Bell                      71      2   66      1
Consolidated Communications          62      1   65      2
Cox                                  64      2   59      1
Frontier                             61      2   56      1
Lumen (Century Link)                 64    1.5   64    1.5
Midco                                74      2   68      1
Optimum                              56      2   53      1
Astound (RCN)                        73      2   65      1
Sparklight                           71      2   59      1
SuddenLink                           53      2   49      1
TDS                                  65      2   60      1
Verizon                              70      2   66      1
Wave                                 71      2   64      1
WOW!                                 71      2   63      1
xfinity (Comcast)                    65      2   61      1
Xtream (Mediacom)                    61      2   56      1
Rank Totals                               38.5        21.5
Rank Totals Squared                    1482.25      462.25

r = 20, c = 2
Sum of rankings = 38.5 + 21.5 = 60
Check the rankings: $\dfrac{rc(c+1)}{2} = \dfrac{20(2)(3)}{2} = 60$
$F_R = \dfrac{12}{rc(c+1)} \sum_{j=1}^{c} R_{.j}^2 - 3r(c+1) = \dfrac{12}{20(2)(3)}\left(38.5^2 + 21.5^2\right) - 3(20)(3) = 14.45$
d.f. = c – 1 = 1. For α = 0.05, critical value = 3.841.
Since FR = 14.45 > 3.841, reject H0. There is evidence of a difference in the median service rating between Internet and TV Service.

(b)

In (a), you conclude that there is evidence of a difference in the median service rating between Internet and TV Service while in problem 11.52, you conclude that there is evidence of a difference in the mean rating between Internet and TV Service.
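The Friedman rank statistic in this group of problems follows directly from the rank totals, so it is easy to verify numerically. The sketch below is an assumption-labeled illustration: the function name is hypothetical, it takes the column rank totals as already computed, and it applies no adjustment for ties.

```python
def friedman_statistic(rank_totals, r):
    """Friedman rank test statistic from the c column rank totals and r blocks.

    F_R = 12 / (r * c * (c + 1)) * sum(R_j ** 2) - 3 * r * (c + 1), without tie adjustment.
    """
    c = len(rank_totals)
    sum_of_squares = sum(total ** 2 for total in rank_totals)
    return 12.0 / (r * c * (c + 1)) * sum_of_squares - 3.0 * r * (c + 1)

# Problem 12.95: rank totals 38.5 (Internet) and 21.5 (TV) over r = 20 companies.
fr = friedman_statistic([38.5, 21.5], r=20)
reject = fr > 3.841  # chi-square critical value for d.f. = 1 at alpha = 0.05, from the solution

print(round(fr, 2))
print(reject)
```

Feeding in the "Sum of Ranks" columns from the Minitab outputs in Problems 12.94, 12.97 and 12.98 reproduces the unadjusted S values reported there.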



12.96

Minitab output:

(a)

$H_0: M_A = M_B = M_C = M_D$
$H_1$: Not all medians are equal.
Since the p-value is virtually zero, reject H0 at the 0.05 level of significance. There is evidence of a difference in the median prices for these items at the four supermarkets.

(b)

In (a), you conclude that there is evidence of a difference in the median prices for these items at the four supermarkets while in problem 11.25, you conclude that there is evidence of a difference between the mean price of these items at the four supermarkets.

12.97

Minitab output:
Friedman Test: Thickness versus Position, Batch1

Friedman test for Thickness by Position blocked by Batch1

S = 97.97   DF = 4   P = 0.000
S = 99.63   DF = 4   P = 0.000 (adjusted for ties)

                      Est   Sum of
Position    N      Median    Ranks
1          30      240.45     32.0
2          30      242.55     64.0
18         30      245.25     97.5
19         30      249.15    141.0
28         30      246.85    115.5

Grand median = 244.85

(a)

$H_0: M_1 = M_2 = M_{18} = M_{19} = M_{28}$
$H_1$: Not all medians are equal.
Since the p-value is virtually zero, reject H0 at the 0.05 level of significance. There is evidence of a difference in the median thickness of the wafers for the five positions.

(b)

In (a), you conclude that there is evidence of a difference in the median thickness of the wafers for the five positions, and in problem 11.27, you conclude that there is evidence of a difference in the mean thickness of the wafers for the five positions.


12.98

Minitab output:
Friedman Test: Strength versus Days, Samples

Friedman test for Strength by Days blocked by Samples

S = 80.00   DF = 2   P = 0.000

                  Est   Sum of
Days    N      Median    Ranks
2      40      3.0863     40.0
7      40      3.5888     80.0
28     40      4.5838    120.0

Grand median = 3.7529

(a)

$H_0: M_2 = M_7 = M_{28}$
$H_1$: Not all medians are equal.
Since the p-value is virtually zero, reject H0 at the 0.05 level of significance. There is evidence of a difference in the median compressive strength after 2, 7 and 28 days.

(b)

In (a), you conclude that there is evidence of a difference in the median compressive strength after 2, 7 and 28 days, and in problem 11.28, you conclude that there is evidence of a difference in the mean compressive strength after 2, 7 and 28 days.


Chapter 16

16.65

The price of the commodity in 2018 was 75% higher than in 1995.

16.66

(a)

2016 as the base year:
$I_{2016} = \dfrac{P_{2016}}{P_{2016}}(100) = \dfrac{\$5}{\$5}(100) = 100$
$I_{2017} = \dfrac{P_{2017}}{P_{2016}}(100) = \dfrac{\$8}{\$5}(100) = 160$
$I_{2018} = \dfrac{P_{2018}}{P_{2016}}(100) = \dfrac{\$7}{\$5}(100) = 140$

(b)

2017 as the base year:
$I_{2016} = \dfrac{P_{2016}}{P_{2017}}(100) = \dfrac{\$5}{\$8}(100) = 62.5$
$I_{2017} = \dfrac{P_{2017}}{P_{2017}}(100) = \dfrac{\$8}{\$8}(100) = 100$
$I_{2018} = \dfrac{P_{2018}}{P_{2017}}(100) = \dfrac{\$7}{\$8}(100) = 87.5$

16.67

(a)

$I_U^{2018} = \dfrac{\sum_{i=1}^{3} P_i^{2018}}{\sum_{i=1}^{3} P_i^{1995}}(100) = \dfrac{43}{23}(100) = 186.96$

(b)

$I_L^{2018} = \dfrac{\sum_{i=1}^{3} P_i^{2018} Q_i^{1995}}{\sum_{i=1}^{3} P_i^{1995} Q_i^{1995}}(100) = \dfrac{240}{148}(100) = 162.16$

(c)

$I_P^{2018} = \dfrac{\sum_{i=1}^{3} P_i^{2018} Q_i^{2018}}{\sum_{i=1}^{3} P_i^{1995} Q_i^{2018}}(100) = \dfrac{227}{147}(100) = 154.42$
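The index calculations in Problems 16.66 and 16.67 all reduce to a ratio scaled by 100. The sketch below illustrates this with the prices from 16.66 and the numerator and denominator totals from 16.67; the function and variable names are hypothetical.

```python
def price_index(value, base_value):
    """Index number: 100 times the ratio of a period's value to the base-period value."""
    return 100.0 * value / base_value

# Problem 16.66: prices of the commodity by year.
prices = {2016: 5.0, 2017: 8.0, 2018: 7.0}
index_base_2016 = {year: price_index(p, prices[2016]) for year, p in prices.items()}
index_base_2017 = {year: price_index(p, prices[2017]) for year, p in prices.items()}

# Problem 16.67: the aggregate indexes are the same ratio applied to summed terms.
unweighted_2018 = price_index(43, 23)    # sum of 2018 prices over sum of 1995 prices
laspeyres_2018 = price_index(240, 148)   # 2018 prices weighted by 1995 quantities
paasche_2018 = price_index(227, 147)     # 2018 prices weighted by 2018 quantities

print(index_base_2016)
print(index_base_2017)
print(round(unweighted_2018, 2), round(laspeyres_2018, 2), round(paasche_2018, 2))
```

Changing the base year only changes the denominator, which is why the 2017-base series is the 2016-base series divided by 1.6.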



16.68

(a),(b)

(c)

The price index using 2000 as the base year is more useful because it is closer to the present and the DJIA has grown more than 200% over the period.

16.69

(a), (c) For MLB salaries (mean salary of major league baseball players on opening day)

(b)

The MLB salary in 2022 is 86.08% higher than it was in 2003.

(d)

The MLB salary in 2022 is 37.38% higher than it was in 2012.

(e)

Using 2012 as the base year is more useful because it is closer to the present.

(f)

There is an upward trend in MLB salaries from 2004 to 2017, followed by a leveling off from 2017 to 2020, with a dip in 2021.


16.70

(a), (c)

(b)

The average price per pound of fresh tomatoes in 2018 in the U.S. is 210.10% higher than it was in 1980.

(d)

The average price per pound of fresh tomatoes in 2014 in the U.S. is 25.65% higher than it was in 1990.

(e)

There is an upward trend in the cost of fresh tomatoes from 1980 to 2018 with a prominent cyclical component.


16.71

(a)

Price indexes (base = 1992)

Year   Electricity Price Index   Natural Gas Price Index   Fuel Oil Price Index
1992               100                       100                     100
1993                38.10925597              108.9968153              98.37563452
1994               108.3121728               114.6345162              93.29949239
1995               109.8267455               113.2544738              92.69035533
1996               109.0717063               112.1094935             102.2335025
1997               110.6604346               124.7497725             115.3299492
1998               104.269567                119.1916894              98.07106599
1999               101.2583987               116.3898999              84.67005076
2000               101.5864812               120.048529              120.7106599
2001               106.6762545               188.5577798             153.1979695
2002               107.5661221               137.6857749             114.0101523
2003               107.1054583               152.5136488             141.7258883
2004               110.4671805               174.5829542             153.0964467
2005               114.2603537               193.0997877             188.7309645
2006               128.5881216               251.7515924             245.4822335
2007               135.5992                  213.4402487             240.4060914
2008               144.203501                205.4898392             338.7817259
2009               141.5698524               167.5765848             254.7208122
2010               139.3227118               163.4819533             301.2182741
2011               140.4462821               156.8092205             346.7005076
2012               133.4464394               149.4388838             375.3299492
2013               144.9405631               151.0464058             389.9492386
2014               150.5584144               155.6718229             396.3451777
2015               155.0526954               157.1125265             285.3807107
2016               150.5584144               136.1844101             200
2017               150.5584144               151.6530179             252.1827411
2018               152.8055549               158.9323628             294.6192893


(b)

Price indexes (base = 1996)

Year   Electricity Price Index   Natural Gas Price Index   Fuel Oil Price Index
1992                91.68280522               89.19851201             97.81529295
1993                34.93963493               97.22353737             96.22641509
1994                99.30363839              102.2522827              91.2611718
1995               100.6922411               101.0213054              90.6653426
1996               100                       100                     100
1997               101.4565907               111.2749408             112.8103277
1998                95.597264                106.3172134              95.9285005
1999                92.83654044              103.8180588              82.82025819
2000                93.1373357               107.0815015             118.0734856
2001                97.8037826               168.1907339             149.8510427
2002                98.61963822              122.8136625             111.5193644
2003                98.19728872              136.0399053             138.6295929
2004               101.2794099               155.7253974             149.7517378
2005               104.7570975               172.2421373             184.6077458
2006               117.8931971               224.5586743             240.1191658
2007               124.3211504               190.3855259             235.1539225
2008               132.209815                183.2938789             331.3803376
2009               129.795212                149.4758201             249.1559086
2010               127.7349705               145.8234697             294.6375372
2011               128.7650913               139.8714914             339.1261172
2012               122.3474391               133.2972607             367.1300894
2013               132.8855742               134.7311464             381.4299901
2014               138.0361778               138.8569496             387.6861966
2015               142.1566608               140.1420358             279.1459782
2016               138.0361778               121.4744674             195.6305859
2017               138.0361778               135.2722354             246.673287
2018               140.0964193               141.7653027             288.182721


16.71

(c)

For 2018: IU2018 

68.000  41.920  2.902 100   156.998 , using 1992 as base period. 44.501  26.376  0.985

Year   Electricity   Natural Gas   Fuel Oil   Unweighted
1992   44.501        26.376        0.985      100.000
1993   16.959        28.749        0.969      64.954
1994   48.200        30.236        0.919      110.427
1995   48.874        29.872        0.913      110.850
1996   48.538        29.570        1.007      110.093
1997   49.245        32.904        1.136      115.896
1998   46.401        31.438        0.966      109.662
1999   45.061        30.699        0.834      106.585
2000   45.207        31.664        1.189      108.625
2001   47.472        49.734        1.509      137.367
2002   47.868        36.316        1.123      118.709
2003   47.663        40.227        1.396      124.246
2004   49.159        46.048        1.508      134.584
2005   50.847        50.932        1.859      144.218
2006   57.223        66.402        2.418      175.396
2007   60.343        56.297        2.368      165.606
2008   64.172        54.200        3.337      169.365
2009   63.000        44.200        2.509      152.666
2010   62.000        43.120        2.967      150.409
2011   62.500        41.360        3.415      149.279
2012   59.385        39.416        3.697      142.632
2013   64.500        39.840        3.841      150.540
2014   67.000        41.060        3.904      155.804
2015   69.000        41.440        2.811      157.595
2016   67.000        35.920        1.970      145.960
2017   67.000        40.000        2.484      152.353
2018   68.000        41.920        2.902      156.998

(d)

Base year = 1992:

$$I_L^{(2018)} = \frac{\sum_{i=1}^{3} P_i^{(2018)} Q_i^{(1992)}}{\sum_{i=1}^{3} P_i^{(1992)} Q_i^{(1992)}} \times 100 = \frac{68.000(10) + 41.920(24) + 2.902(400)}{44.501(10) + 26.376(24) + 0.985(400)} \times 100 = 193.3977$$

(e) 6,500 kWh = 13 units; 1,040 therms = 26 units; 235 gal = 235 units. Base year = 1992:

$$I_L^{(2018)} = \frac{\sum_{i=1}^{3} P_i^{(2018)} Q_i^{(1992)}}{\sum_{i=1}^{3} P_i^{(1992)} Q_i^{(1992)}} \times 100 = \frac{68.000(13) + 41.920(26) + 2.902(235)}{44.501(13) + 26.376(26) + 0.985(235)} \times 100 = 177.5608$$

Instructional Tips and Solutions for Digital Cases

Chapter 2

Instructional Tips
1. Students should develop a frequency distribution of the More Winners data along with at least one graph such as a histogram, polygon, or cumulative percentage polygon.
2. One objective is to have students look beyond the actual statistical results generated to evaluate the claims presented. For the More Winners data, this might include a comparison with tables and charts developed for the entire Mutual Funds data set. Such a comparison would lead to the realization that all eight funds in the “Big Eight” are high-risk funds that may have a great deal of variation in their return.
3. The presentation of information can lead to different perceptions of a business. This can be seen in the aggressive approach taken in the home page.

Solutions
1. Yes. There is a breathless, exaggerated style to the writing and the illustrations are very busy and colorful without conveying much information. There is also a certain aggressiveness in exclamations to “show me the data.” Claims are made, but supporting evidence is scant. The style is reminiscent of a misleading infomercial. The graphs on pages 5 and 6 have poor design that obscures their meaning, if any. Also, nowhere in the document does EndRun disclose its principals and the address of its operations, something that a reputable business would surely do. And a testimonial page at the end is more suitable for an infomercial selling a consumer product and not something one would expect to see from a reputable financial services firm.
2. Frequencies (Return(%))

Bins     Frequency   Percentage   Cumulative %   Midpts
–50      0           0.00%        0.00%          ---
–40      1           3.45%        3.45%          –45
–30      3           10.34%       13.79%         –35
–20      4           13.79%       27.59%         –25
–10      2           6.90%        34.48%         –15
–0.01    1           3.45%        37.93%         –5
9.99     9           31.03%       68.97%         5
20       3           10.34%       79.31%         15
30       2           6.90%        86.21%         25
40       3           10.34%       96.55%         35
50       1           3.45%        100.00%        45


Although the claim is literally true, the data show a wide range of returns for the 29 mutual funds selected by EndRun investors. Although 18 funds had positive returns, 11 had negative returns for the five-year period. Of the funds having negative returns, many had large losses, with 27.59% having annualized losses of 20% or more. Many of the positive returns were small, with 31.03% having an annualized return between 0 and 10%. All of this raises questions about the effectiveness of the EndRun investment service.

3. Since mutual funds are rated by risk, it would be important to know the “risk” of the funds EndRun chooses. “High” risk funds, as all eight turn out to be, are not a wise choice for certain types of investors. An in-depth analysis would also see if the eight funds were representative of the performance of that group (no, the eight are among the weakest performers, as it turns out). In addition, examining summary measures (discussed in Chapter 3) would also be helpful in evaluating the “Big Eight” funds.

4. You would hope that one’s investment “grew” over time. Whether this is reason to be truly proud would again be based on a comparison to a similar group of funds. You would also like to know such things as whether the gain in value is greater than any inflation that might have occurred during that period. Even more sophisticated reasoning would look at financial planning analysis to see if an investment in the “Big Eight” was a worthy one or one that showed a real gain after tax considerations. A warning flag, however, is that the business feels the need to state that it is “proud” even as it does not state a comparative (such as “we are proud to have outperformed all of the leading national investment services”). Such an emotional claim suggests a lack of rational data that could otherwise be used to make a more persuasive case for using EndRun’s service.


Chapter 3

Instructional Tips
1. Students should compute descriptive statistics and develop a boxplot for the More Winners sample. They should compare the measures of central tendency and take note of the measures of variation. The boxplot can be used to evaluate the symmetry of the data.
2. All too often means and standard deviations are computed on data from a scale (usually 5 or 7 points) that is ordinal at best. Students should be cautioned that such statistics are of questionable value.

Solutions
1. Return(%)

Mean                 –0.61724
Standard Error       4.533863
Median               1.1
Mode                 1.1
Standard Deviation   24.4156
Sample Variance      596.1215
Range                85
Minimum              –41.9
Maximum              43.1
Sum                  –17.9
Count                29
Largest(1)           43.1
Smallest(1)          –41.9

[Boxplot: Returns(%) for More Winners; horizontal axis Return(%), from –50 to 40]

1. For the sample of 29 investors, the average annualized rate of return is –0.62% and the median annualized rate of return is only 1.1%. Thus, half the investors are either losing money or have a very small return. In addition, there is a very large amount of variability, with a standard deviation of over 24% in the annualized return. The data appear fairly symmetric since the distance between the minimum return and the median is about the same as the distance between the median and the largest return. However, the first quartile is more distant from the median than is the third quartile.

2. Calculating mean responses for a categorical variable is a naïve error at best. No methodology for collecting this survey is offered. For several questions, the neutral response dominates, surely not an enthusiastic endorsement of EndRun! Strangely, for the question “How satisfied do you expect to be when using EndRun's services in the coming year?” only 19 responses appear, compared with 26 or 27 responses for the other questions (see the next question). Eliminating the means, considering the questions as categorical variables, and then developing a bar chart for each question would be more appropriate.

3. As proposed, the question expects that the person being surveyed will be using EndRun. Most likely, the missing responses reflect persons who had already planned not to use EndRun and therefore could not answer the question as posed. Survey questions that would uncover reasons for planning to use or not use EndRun would be more insightful.


Chapter 4

Instructional Tips
The main goal of the Digital case for this chapter is to have students be able to distinguish between what is a simple probability, a joint probability, and a conditional probability.

Solutions
1.
                    Return not less than 20%   Return less than 20%
Best 10 Customers   8                          2
Other Customers     0                          19

The claim “four-out-of-five chance of getting annualized rates of return of no less than 20%” is literally accurate, but it applies only to EndRun’s best 10 customers. A more accurate probability would consider all customers (8/25, or about 32%). In fact, none of EndRun’s other customers achieved a return of not less than 20%. Another issue is that you do not know the actual return rates for each customer, so you cannot calculate any meaningful descriptive statistics.

2.
              Invested at EndRun
Made money?   Yes    No
Yes           15     98
No            10     41

The 6% probability calculated (10/164 = 6.10%) is actually the joint probability of investing at EndRun and losing money. The probability of being an EndRun investor who lost money is the conditional probability of losing money given an investment in EndRun, which is equal to 10/25 = 40%.

3. Since the patterns of security markets are somewhat unpredictable by their nature, any probabilities based on past performance are not necessarily indicative of future events. Even if EndRun had the “best” probability for “success”, that would be no guarantee that their investment strategy would work in tomorrow’s market.
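The distinction between simple, joint, and conditional probability in Solution 2 can be illustrated with the counts from the contingency table. This is a minimal sketch; the dictionary keys are labels chosen here for illustration, and the row of 10 and 41 is read as the investors who did not make money.

```python
# Counts from the "Made money?" by "Invested at EndRun" table.
counts = {
    ("endrun", "made_money"): 15,
    ("other", "made_money"): 98,
    ("endrun", "lost_money"): 10,
    ("other", "lost_money"): 41,
}

total = sum(counts.values())  # all 164 investors
endrun_total = counts[("endrun", "made_money")] + counts[("endrun", "lost_money")]

# Simple probability: investing at EndRun.
p_endrun = endrun_total / total
# Joint probability: investing at EndRun AND not making money.
p_joint = counts[("endrun", "lost_money")] / total
# Conditional probability: not making money GIVEN an investment at EndRun.
p_conditional = counts[("endrun", "lost_money")] / endrun_total

print(f"simple={p_endrun:.4f} joint={p_joint:.4f} conditional={p_conditional:.4f}")
```

The joint probability divides by all 164 investors, while the conditional probability divides by only the 25 EndRun investors, which is why the two figures differ so sharply.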


Chapter 5

Instructional Tips
This digital case involves computing expected values and standard deviations of probability distributions and then using portfolio risk to obtain a good expected return with a lower risk than what would be involved if an entire investment was made in one fund.
1. Students need to realize that a very good return may occur only under certain circumstances.
2. Students need to realize that how the probabilities of the various events are obtained is of crucial importance to the results.
3. Using PHStat2, students can determine the expected portfolio return and portfolio risk of different combinations of two different funds.

Solutions
1. Yes! “With EndRun's Worried Bear Fund, you can get a four hundred percent rate of return in times of recession!” However, EndRun itself estimates the probability of recession at only 20% in its own calculations. “With EndRun's Happy Bull Fund, you can make twelve times your initial investment (that's a 1200 percent rate of return!) in a fast expanding, booming economy.” In this case, EndRun itself estimates the probability of a fast-expanding economy at only 10%.
2. Estimating the probabilities of the outcomes is very subjective. It is never made clear how the value of the outcomes was determined.
3. There are several factors to consider. Most obviously, if an investor believed in a different set of probabilities, then the Worried Bear fund would not necessarily have the better expected return. An investor more concerned about risk would want to examine other measures (such as the standard deviation of each investment, the expected portfolio return, and the portfolio risk of different combinations of investments). Investors who hedge might also invest in a lower expected return fund if the pattern of outcomes is radically different (as it is in the case of the two EndRun funds).


EndRun Portfolio Analysis

Outcomes                  P     Happy Bull   Worried Bear
fast expanding economy    0.1   1200         –300
expanding economy         0.2   600          –200
weak economy              0.5   –100         100
recession                 0.2   –900         400

Weight Assigned to X      0.5

Statistics
E(X)                      10
E(Y)                      60
Variance(X)               382900
Standard Deviation(X)     618.7891
Variance(Y)               50400
Standard Deviation(Y)     224.4994
Covariance(XY)            –137600
Variance(X+Y)             158100
Standard Deviation(X+Y)   397.6179

Portfolio Management
Weight Assigned to X   Weight Assigned to Y   Portfolio Expected Return   Portfolio Risk
0.5                    0.5                    35                          198.809
0.3                    0.7                    45                          36.94591
0.2                    0.8                    50                          59.4979
0.1                    0.9                    55                          141.0142
0.7                    0.3                    25                          366.5583
0.9                    0.1                    15                          534.6821

Note that of the two funds, Worried Bear has both a higher expected return and a lower standard deviation. From the results above, it appears that a good approach is to invest more in the Worried Bear fund than the Happy Bull fund to achieve a higher expected portfolio return while minimizing the risk. A reasonable choice is to invest 30% in the Happy Bull fund and 70% in the Worried Bear fund to achieve an expected portfolio return of 45 with a portfolio risk of 36.95. This risk is substantially below the standard deviation of about 619 for the Happy Bull fund and about 224 for the Worried Bear fund. The expected portfolio return of 45 is much higher than the expected return for investing in only the Happy Bull fund and is somewhat below the expected return for investing completely in the Worried Bear fund. Of course, with the knowledge about EndRun accumulated through Digital cases in Chapters 2–5, a reasonable course of action would be not to invest any money with EndRun!
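The portfolio figures reported for this case can be reproduced from the outcome table with the usual expected value, covariance, and portfolio risk formulas. The sketch below assumes X is the Happy Bull fund and Y is the Worried Bear fund, as in the PHStat output; the function names are illustrative.

```python
from math import sqrt

probs = [0.1, 0.2, 0.5, 0.2]
happy_bull = [1200, 600, -100, -900]     # X
worried_bear = [-300, -200, 100, 400]    # Y


def expected(values):
    """Expected value of a discrete variable under probs."""
    return sum(p * v for p, v in zip(probs, values))


def covariance(xs, ys):
    """Covariance of two discrete variables; covariance(xs, xs) is the variance."""
    ex, ey = expected(xs), expected(ys)
    return sum(p * (x - ex) * (y - ey) for p, x, y in zip(probs, xs, ys))


def portfolio(weight_x):
    """Return (expected return, risk) for weight_x in X and the rest in Y."""
    w = weight_x
    exp_return = w * expected(happy_bull) + (1 - w) * expected(worried_bear)
    variance = (w ** 2 * covariance(happy_bull, happy_bull)
                + (1 - w) ** 2 * covariance(worried_bear, worried_bear)
                + 2 * w * (1 - w) * covariance(happy_bull, worried_bear))
    return exp_return, sqrt(variance)


for w in (0.5, 0.3, 0.2, 0.1, 0.7, 0.9):
    exp_return, risk = portfolio(w)
    print(f"w_X={w:.1f}  expected return={exp_return:.0f}  risk={risk:.4f}")

exp_30, risk_30 = portfolio(0.3)
```

Because the covariance is strongly negative (–137600), mixing the two funds cancels much of the variability, which is why the 30/70 split has far lower risk than either fund alone.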


Chapter 6

Instructional Tips
This digital case consists of two parts – determining whether the download times are approximately normally distributed and then evaluating the validity of various statements made concerning the download times that relate to understanding the meaning of probabilities from the normal distribution.

Solution
1. Statistics

Sample Size      300
Mean             13.3472
Median           13.535
Std. Deviation   3.137250
Minimum          5.15
Maximum          21.31

From the normal probability plot, the data appear to be approximately normally distributed. In addition, the distance from the minimum value to the median is approximately the same as the distance from the median to the maximum value.

2. “Standard deviation of 3.1272” This is false because the standard deviation is 3.137250.

“• One time out of every 10 times, an individual user will experience a download time that is greater than 17.37 seconds.” The probability of a download time above 17.37 seconds is 9.3%. The probability of a download time of 17.37 seconds or greater is 9.67%. However, this does not mean that one of every ten downloads will take more than 17.37 seconds. It means that if the data are normally distributed with μ = 13.3472 seconds and a standard deviation of 3.137250 seconds, 10% of all downloads will take more than 17.37 seconds.

“• An 18-second download time has a probability of only 0.069.” This is false since the probability of an exact download time is zero. Statements should be made concerning the likelihood that the download time is less than a specific value. For example, the probability of a download time less than 18 seconds is 0.9316 or 93.16%.

Normal Probabilities

Common Data
Mean                   13.3472
Standard Deviation     3.1272

Probability for X <=
X Value                18
Z Value                1.4878486
P(X<=18)               0.9316

“• Because all the download times fall within plus or minus 3 standard deviations, the movie download process meets the Six Sigma benchmark for industrial quality. (Recall that senior management held a meeting last month on the importance of the Six Sigma methodology.)” Note: Six Sigma is discussed in online Chapter 19 of the text. This statement is “double talk”. In a normal distribution, 99.7% of all measurements fall within plus or minus 3 standard deviations. Six Sigma is a managerial approach designed to create processes that result in no more than 3.4 defects per million. The QRT needs to determine the requirements of the customers and then determine the capability of the current process (see Section 19.6) before embarking on quality improvement efforts.

3. If the standard deviation is lowered to 2 seconds, keeping the mean at 13.3472 seconds, the probability of a download taking less than 18 seconds would change from 0.9316 to 0.9900.

Normal Probabilities

Common Data
Mean                   13.3472
Standard Deviation     2

Probability for X <=
X Value                18
Z Value                2.3264
P(X<=18)               0.9900

4. If the standard deviation is assumed to remain at 3.1272 while the mean is lowered, the probability of obtaining a download time below a specific number of seconds would increase. For example, the probability of having a download time below 18 seconds with a mean of 11.3472 seconds instead of a mean of 13.3472 seconds is 98.33% instead of 93.16%.

Normal Probabilities

Common Data
Mean                   11.3472
Standard Deviation     3.1272

Probability for X <=
X Value                18
Z Value                2.1274
P(X<=18)               0.9833
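The three normal probabilities for an 18-second download time can be checked with Python's standard library. This sketch uses `statistics.NormalDist` in place of PHStat; the means and standard deviations are the ones shown in the tables for Solutions 2 through 4.

```python
from statistics import NormalDist

x = 18  # seconds

# Solution 2: mean and standard deviation as used in the worksheet.
p_base = NormalDist(mu=13.3472, sigma=3.1272).cdf(x)
# Solution 3: standard deviation lowered to 2 seconds.
p_lower_sd = NormalDist(mu=13.3472, sigma=2).cdf(x)
# Solution 4: mean lowered by 2 seconds, standard deviation unchanged.
p_lower_mean = NormalDist(mu=11.3472, sigma=3.1272).cdf(x)

# Z value for the base case, (x - mean) / standard deviation.
z_base = (x - 13.3472) / 3.1272

print(f"{p_base:.4f} {p_lower_sd:.4f} {p_lower_mean:.4f} z={z_base:.4f}")
```

Either reducing the variability or reducing the mean raises the chance of a download finishing within 18 seconds.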



Chapter 7

Instructional Tips
This digital case focuses on two concepts – the need for random sampling and the application of the sampling distribution of the mean.

Solutions
1. “For our investigation, members of our group went to their favorite stores … One member thought her box of Oxford’s Pennsylvania Dutch-Style Chocolate Brownie Morning Squares was short, but her son opened the box and starting eating its contents before we could weigh the box...” These comments suggest that a non-random, informal collection procedure was used. When the data are examined, you discover that the sample size is only 5 for each of the two snacks. Drawing a random sample and using a larger sample size would add rigor by reducing the variability in the sample means.

2. (a)
         Oxford Cheez Squares   Alpine Granola Frosted Pretzels
         360.4                  366.1
         361.8                  367.2
         362.3                  365.6
         364.2                  367.8
         371.4                  373.5
Mean     364.02                 368.04

(b) If $\sigma = 15$, then $\sigma_{\bar{X}} = \frac{15}{\sqrt{5}} = 6.7082$, and with an expected population mean of 368 grams,

Normal Probabilities: Box Weight for Oxford Cheez Squares

Mean                   368
Standard Deviation     6.7082

Probability for X <=
X Value                364.02
Z Value                –0.593304
P(X<=364.02)           0.2764889

The likelihood of obtaining a sample average weight of no more than 364.02 grams if the population weight is 368 grams is 27.65%.


Normal Probabilities: Box Weights for Alpine Granola Frosted Pretzels

Mean                   368
Standard Deviation     6.7082

Probability for X <=
X Value                368.04
Z Value                0.005962851
P(X<=368.04)           0.502378833

The likelihood of obtaining a sample average weight of no more than 368.04 grams if the population weight is 368 grams is 50.24%.


(c)
Normal Probabilities: Box Weight for Oxford Cheez Squares

Mean                   368
Standard Deviation     15

Probability for X <=
X Value                364.02
Z Value                –0.265333
P(X<=364.02)           0.3953764

The likelihood of obtaining an individual weight of no more than 364.02 grams if the population weight is 368 grams is 39.54%.

Normal Probabilities: Box Weights for Alpine Granola Frosted Pretzels

Mean                   368
Standard Deviation     15

Probability for X <=
X Value                368.04
Z Value                0.002666667
P(X<=368.04)           0.501063851

The likelihood of obtaining an individual weight of no more than 368.04 grams if the population weight is 368 grams is 50.11%.

3. There is a fairly high chance that an individual box of Oxford Cheez Squares or the mean of a sample of five boxes will have a weight below 364.02 grams. There is more than a 50% chance that an individual box of Alpine Granola Frosted Pretzels or the average of a sample of five boxes will have a weight below 368.04 grams. This is true even though four of the five boxes in each sample contain less than 368 grams.

4. Arguments for being reasonable:
• Statistical procedure used is invalid.
• The mean of the one group actually exceeds 368.
• Confusion over conclusions that can be drawn from a sample.
• Possibility of investigator bias.
Arguments against:
• Data are available for independent review.
• Oxford is producing some boxes of snacks that had less cereal than claimed on their boxes.
• Right of individuals to freely express non-libelous opinions.

5. Even for the Oxford Cheez Squares sample, you cannot prove cheating without using statistical inference. When the techniques of the next two chapters are applied, it will turn out that with these samples, there is insufficient evidence that the population mean is less than 368 grams.
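The contrast between parts (b) and (c) of Solution 2, a probability for a sample mean versus a probability for an individual box, can be sketched as follows. The population mean of 368 grams and standard deviation of 15 grams are the values assumed in the solution; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 368, 15, 5
standard_error = sigma / sqrt(n)   # standard deviation of the sample mean

sample_means = {
    "Oxford Cheez Squares": 364.02,
    "Alpine Granola Frosted Pretzels": 368.04,
}

results = {}
for snack, xbar in sample_means.items():
    # Part (b): sample mean of five boxes, so use sigma / sqrt(n).
    p_mean = NormalDist(mu, standard_error).cdf(xbar)
    # Part (c): a single box, so use sigma itself.
    p_individual = NormalDist(mu, sigma).cdf(xbar)
    results[snack] = (p_mean, p_individual)
    print(f"{snack}: P(mean <= {xbar}) = {p_mean:.4f}, "
          f"P(box <= {xbar}) = {p_individual:.4f}")
```

The smaller standard error pulls the sample-mean probability for Oxford Cheez Squares (about 0.28) further from 0.5 than the individual-box probability (about 0.40), yet neither is small enough to be unusual.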



Chapter 8

Instructional Tips
This digital case focuses on two concepts – the need to develop confidence interval estimates rather than point estimates, and using statistical methods to determine sample size.

Solutions
1. Using PHStat

Confidence Interval Estimate for the Proportion

                                   Conbanco   Pay a Friend (PAF)
Data
Sample Size                        200        200
Number of Successes                90         110
Confidence Level                   95%        95%
Intermediate Calculations
Sample Proportion                  0.45       0.55
Z Value                            -1.9600    -1.9600
Standard Error of the Proportion   0.0352     0.0352
Interval Half Width                0.0689     0.0689
Confidence Interval
Interval Lower Limit               0.3811     0.4811
Interval Upper Limit               0.5189     0.6189

The confidence interval estimate for each of the two groups includes 0.50 or 50%. The proportion in the population using Pay A Friend (PAF) is estimated to be between 48.1% and 61.9%, while the proportion using Conbanco is estimated to be between 38.1% and 51.9%.
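The two proportion intervals can be reproduced with the normal-approximation formula for a confidence interval for a proportion. This is a sketch using Python's standard library rather than PHStat; the function name is illustrative.

```python
from math import sqrt
from statistics import NormalDist


def proportion_ci(successes, n, confidence=0.95):
    """Normal-approximation confidence interval for a population proportion."""
    p = successes / n
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)   # about 1.96 for 95%
    half_width = z * sqrt(p * (1 - p) / n)
    return p - half_width, p + half_width


conbanco_ci = proportion_ci(90, 200)
paf_ci = proportion_ci(110, 200)

print(f"Conbanco: {conbanco_ci[0]:.4f} to {conbanco_ci[1]:.4f}")
print(f"PAF:      {paf_ci[0]:.4f} to {paf_ci[1]:.4f}")
```

Both intervals contain 0.50, matching the conclusion stated in the solution.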

2. Using PHStat

Confidence Interval Estimate for the Mean

                             Conbanco       PAF
Data
Sample Standard Deviation    10.70568735    7.479171836
Sample Mean                  32.39          25.25
Sample Size                  90             110
Confidence Level             95%            95%
Intermediate Calculations
Standard Error of the Mean   1.128478532    0.713111054
Degrees of Freedom           89             109
t Value                      1.9870         1.9820
Interval Half Width          2.2423         1.4134
Confidence Interval
Interval Lower Limit         30.15          23.84
Interval Upper Limit         34.63          26.66

The 95% confidence interval estimate for the mean payment amount is $23.84 to $26.66 for Pay a Friend and $30.15 to $34.63 for Conbanco. Since the confidence intervals for the proportion for both Pay a Friend and Conbanco include 0.50 or 50%, there is no evidence that customers use the two forms of payment in unequal numbers. The two confidence intervals for the mean do not overlap. Even so, these data are useful in pointing out that when comparing the means of two groups, the confidence interval estimates for each group should not be compared. The correct procedure is to use the t test for the difference between the means and the confidence interval estimate for the difference between two means (to be covered in Chapter 10). The results of this test indicate a significant difference in the mean purchase amount between the two forms of payment.


Pooled-Variance t Test for the Difference Between Two Means (assumes equal population variances)

Data
Hypothesized Difference                  0
Level of Significance                    0.05
Population 1 Sample
Sample Size                              110
Sample Mean                              25.25154545
Sample Standard Deviation                7.479171836
Population 2 Sample
Sample Size                              90
Sample Mean                              32.38766667
Sample Standard Deviation                10.70568735

Intermediate Calculations
Population 1 Sample Degrees of Freedom   109
Population 2 Sample Degrees of Freedom   89
Total Degrees of Freedom                 198
Pooled Variance                          82.3116
Standard Error                           1.2895
Difference in Sample Means               -7.1361
t Test Statistic                         -5.5339

Two-Tail Test
Lower Critical Value                     -1.9720
Upper Critical Value                     1.9720

3.

Using the range of the data divided by 6 as an estimate of the population standard deviation, (56.84 – 3.32)/6 = 8.92, the sample size necessary for 95% confidence with a sampling error of ± $3 is 34. Thus, a sample size of 200 is large enough.

Sample Size Determination

Data
Population Standard Deviation   8.92
Sampling Error                  3
Confidence Level                95%

Intermediate Calculations
Z Value                         -1.9600
Calculated Sample Size          33.9612

Result
Sample Size Needed              34.0000
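The sample size result follows from n = (Zσ/e)², rounded up to the next whole number. A minimal sketch of that calculation, using the range-based estimate of σ from the solution (the variable names are illustrative):

```python
from math import ceil
from statistics import NormalDist

data_range = 56.84 - 3.32
sigma_estimate = round(data_range / 6, 2)   # range / 6, reported as 8.92
sampling_error = 3                          # dollars
confidence = 0.95

z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
n_raw = (z * sigma_estimate / sampling_error) ** 2
n_needed = ceil(n_raw)                      # always round up

print(f"sigma={sigma_estimate}  calculated n={n_raw:.4f}  needed n={n_needed}")
```

Rounding up rather than to the nearest integer keeps the sampling error at or below the stated ± $3.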

Chapter 9

Instructional Tips
There are several objectives involved in this digital case.
1. Have students question the validity of data collected.
2. Have students look for hidden issues that could invalidate a set of conclusions.
3. Have students use hypothesis testing to draw conclusions about a claimed value.
4. Increase students’ understanding of the effect of sampling on a conclusion.

Solutions
1. Issues that could be raised about the testing process include the size of the sample, how the sample was selected, the selection of only two brands of cereals, the identity of the independent testers (not disclosed), and whether, as discussed in subsequent chapters, there is a single sample or, in fact, samples of two different snacks. Also, if you read all of the materials related to the television station, you could raise issues about the independence of the consumer reporter and wonder why only one out of four plants was chosen for this analysis.

2. t Test for Hypothesis of the Mean

Data
Null Hypothesis                μ = 368
Level of Significance          0.05
Sample Size                    80
Sample Mean                    370.433375
Sample Standard Deviation      14.70776355

Intermediate Calculations
Standard Error of the Mean     1.644377955
Degrees of Freedom             79
t Test Statistic               1.479814901

Lower-Tail Test
Lower Critical Value           –1.664370757
p-Value                        0.928550208
Do not reject the null hypothesis

The mean weight is actually above the hypothesized weight of 368 grams by 1.48 standard errors. Clearly, with a p-value of 0.929, there is no reason to believe that the mean weight is below 368 grams. However, as noted in the press release, samples of two different snacks were selected, so the question can be raised as to whether separate analyses should have been done on each snack.

3. The claim is true since 42 boxes contain more than 368 grams. However, if the mean were equal to 368, you would expect that approximately half of the boxes would contain more than 368 grams, so the result is certainly not surprising. Of course, the Oxford CEO does not mention that 38 boxes contained less than 368 grams.

4. Sample statistics will vary from sample to sample. It is possible that a sample with a mean below 368 grams and a sample with a mean above 368 grams will both lead to the conclusion that there is insufficient evidence that the population mean is below 368 grams. In fact, if you use the CCACC sample of 10 snack boxes discussed in Chapter 7, the results of the test for whether the population mean is below 368 are not significant.

t Test for Hypothesis of the Mean

Data
Null Hypothesis                μ = 368
Level of Significance          0.05
Sample Size                    10
Sample Mean                    366.03
Sample Standard Deviation      4.165746565

Intermediate Calculations
Standard Error of the Mean     1.31732473
Degrees of Freedom             9
t Test Statistic               -1.495455111

Lower-Tail Test
Lower Critical Value           -1.833113856
p-Value                        0.08450497
Do not reject the null hypothesis
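Both one-sample t tests in this case reduce to the same calculation, t = (sample mean – 368) / (s/√n), followed by a comparison with the lower-tail critical value. The sketch below recomputes the test statistics from the summary values shown in the tables; the critical values are copied from the PHStat output rather than derived, since the Python standard library has no t distribution.

```python
from math import sqrt


def t_statistic(sample_mean, sample_sd, n, hypothesized_mean):
    """One-sample t test statistic computed from summary statistics."""
    standard_error = sample_sd / sqrt(n)
    return (sample_mean - hypothesized_mean) / standard_error


# Sample of 80 boxes (Solution 2) and CCACC sample of 10 boxes (Solution 4).
t_80 = t_statistic(370.433375, 14.70776355, 80, 368)
t_10 = t_statistic(366.03, 4.165746565, 10, 368)

# Lower-tail critical values at the 0.05 level, taken from the tables above.
reject_80 = t_80 < -1.664370757
reject_10 = t_10 < -1.833113856

print(f"n=80: t={t_80:.4f}, reject H0: {reject_80}")
print(f"n=10: t={t_10:.4f}, reject H0: {reject_10}")
```

Neither statistic falls below its critical value, so both samples lead to the same decision even though one sample mean is above 368 and the other is below it.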



Chapter 10

Instructional Tips
The objectives for the digital case in this chapter are to have students:
1. Understand that just having sample statistics does not mean that claims can be made about differences between groups without using hypothesis testing.
2. Use two-sample tests of hypothesis to determine whether there are significant differences between two groups.

Solutions
1. Although the means of the two samples are different, without the necessary tests of hypothesis, you cannot infer that the two processes are statistically different. This, of course, assumes that CCACC has drawn random samples, something that is unclear in their posting.

2. Pooled-Variance t Test for the Difference Between Two Means (assumes equal population variances)

Data
Hypothesized Difference                  0
Level of Significance                    0.05
Population 1 Sample
Sample Size                              10
Sample Mean                              372.485
Sample Standard Deviation                13.45716104
Population 2 Sample
Sample Size                              10
Sample Mean                              365.549
Sample Standard Deviation                10.07565432

Intermediate Calculations
Population 1 Sample Degrees of Freedom   9
Population 2 Sample Degrees of Freedom   9
Total Degrees of Freedom                 18
Pooled Variance                          141.3070
Standard Error                           5.3161
Difference in Sample Means               6.9360
t Test Statistic                         1.3047

Two-Tail Test
Lower Critical Value                     -2.1009
Upper Critical Value                     2.1009

Upper-Tail Test
Upper Critical Value                     1.7341
p-Value                                  0.1042

For Plant 1 and Plant 2

F Test for Differences in Two Variances

Data
Level of Significance                    0.05
Larger-Variance Sample
Sample Size                              10
Sample Variance                          181.0951833
Smaller-Variance Sample
Sample Size                              10
Sample Variance                          101.51881

Intermediate Calculations
F Test Statistic                         1.7839
Population 1 Sample Degrees of Freedom   9
Population 2 Sample Degrees of Freedom   9

Upper-Tail Test
Upper Critical Value                     3.1789
p-Value                                  0.2008

Two-Tail Test
Upper Critical Value                     4.0260
p-Value                                  0.4016

For Plant 1 and Plant 2

Wilcoxon Rank Sum Test

Data
Level of Significance     0.05
Population 1 Sample
Sample Size               10
Sum of Ranks              120
Population 2 Sample
Sample Size               10
Sum of Ranks              90

Intermediate Calculations
Total Sample Size n       20
T1 Test Statistic         120
T1 Mean                   105
Standard Error of T1      13.2288
Z Test Statistic          1.1338934

Upper-Tail Test
Upper Critical Value      1.6449
p-Value                   0.1284
Do not reject the null hypothesis

The t test for the difference between the means indicates a test statistic of tSTAT = 1.30 and a one-tail p-value of 0.104. The F test for the equality of variances indicates a test statistic of FSTAT = 1.784 and a two-tail p-value of 0.40. The Wilcoxon rank sum test (covered in Section 12.6) indicates a test statistic of ZSTAT = 1.134 and a one-tail p-value of 0.128. Thus, there is insufficient statistical evidence to indicate any difference in the mean, median, or variability between Plant 1 and Plant 2.
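The three test statistics quoted for Plant 1 and Plant 2 can be recomputed from the summary values in the PHStat tables. This is a sketch of the arithmetic only, with illustrative variable names; the p-values and critical values still come from the tables, because the Python standard library has no t or F distribution.

```python
from math import sqrt

n1, mean1, sd1 = 10, 372.485, 13.45716104   # Plant 1
n2, mean2, sd2 = 10, 365.549, 10.07565432   # Plant 2

# Pooled-variance t test for the difference between two means.
pooled_variance = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2)
standard_error = sqrt(pooled_variance * (1 / n1 + 1 / n2))
t_stat = (mean1 - mean2) / standard_error

# F test: larger sample variance over smaller sample variance.
f_stat = max(sd1, sd2) ** 2 / min(sd1, sd2) ** 2

# Wilcoxon rank sum test, large-sample normal approximation.
t1 = 120                                    # sum of ranks, Plant 1 sample
n = n1 + n2
t1_mean = n1 * (n + 1) / 2
t1_standard_error = sqrt(n1 * n2 * (n + 1) / 12)
z_stat = (t1 - t1_mean) / t1_standard_error

print(f"t={t_stat:.4f}  F={f_stat:.4f}  Z={z_stat:.4f}")
```

All three statistics fall short of the upper critical values shown in the tables (1.7341, 3.1789, and 1.6449 for the upper-tail tests), consistent with the conclusion of no detectable difference.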


Chapter 11

Instructional Tips
The objectives for the digital case in this chapter are to have students:
1. Understand that just having sample statistics does not mean that claims can be made about differences between groups without using hypothesis testing.
2. Use the one-factor Analysis of Variance to determine whether there are significant differences among more than two groups.
3. See that there can be anomalies that can occur when analyzing data in which one analysis can lead to a certain conclusion, and a different analysis might lead to another conclusion.

Solutions
1. Yes, because Oxford Snacks operates four plants, a careful examination would explore if there are differences among the four plants. A proper sample of the population of snack boxes would include boxes from all four plants. In addition, as in an earlier case, it is unclear if the CCACC sample is randomly drawn from all snack boxes available. From their posting, it seems as if their members actively excluded boxes from plants other than #1 and #2.

2. In order to determine whether there is a difference in the weights among the four plants, a one-factor analysis of variance needs to be done.

Anova: Single Factor

SUMMARY
Groups    Count   Sum       Average    Variance
Plant 1   20      7448      372.4      132.1037
Plant 2   20      7324.07   366.2035   218.1177
Plant 3   20      7393.12   369.656    222.0002
Plant 4   20      7531.72   376.586    131.1284

ANOVA
Source of Variation   SS         df   MS         F         P-value    F crit
Between Groups        1155.949   3    385.3162   2.19132   0.095938   2.724946
Within Groups         13363.65   76   175.8375
Total                 14519.6    79

Copyright ©2024 Pearson Education, Inc.


Kruskal-Wallis Test of Snack Box Weights

Data
Level of Significance                0.05

Intermediate Calculations
Sum of Squared Ranks/Sample Size     134728.4
Sum of Sample Sizes                  80
Number of groups                     4

Test Result
H Test Statistic                     6.496991
Critical Value                       7.814725
p-Value                              0.089781
Do not reject the null hypothesis


The ANOVA results, with an FSTAT test statistic equal to 2.19 < 2.72 or a p-value = 0.0959 > 0.05, indicate that there is insufficient evidence to conclude that there is a difference in the means of the four plants. The nonparametric Kruskal-Wallis test (covered in Section 12.5) provides similar results, with a χ²STAT test statistic = 6.497 < 7.815 or a p-value = 0.0898 > 0.05. Interestingly, had CCACC argued that something was amiss only in Plant 2, but not in Plants 1, 3, and 4, there is some evidence that this is the case. Using an a priori research hypothesis that focused on testing differences between plants 1, 3, and 4 as compared to plant 2, the following results are obtained.

t-Test: Two-Sample Assuming Equal Variances

                               Plant 1, 3, 4   Plant 2
Mean                           372.880667      366.2035
Variance                       164.518528      218.1177
Observations                   60              20
Pooled Variance                177.574747
Hypothesized Mean Difference   0
df                             78
t Stat                         1.94065012
P(T<=t) one-tail               0.02795698
t Critical one-tail            1.66462542
P(T<=t) two-tail               0.05591397
t Critical two-tail            1.99084752

Since tSTAT = 1.94 > 1.664 or the p-value = 0.028 < 0.05, there is evidence that the mean weight of snack boxes in plants 1, 3, and 4 is greater than the mean weight in plant 2.

3. The one-way ANOVA shows that the null hypothesis cannot be rejected, so you cannot claim a statistical difference among the four plants. The mean weight of the 80 boxes in the sample is 371.2 grams, consistent with a claim that boxes average 368 grams. Interestingly, an analysis that pits Plant #2 against the other plants indicates that a statistically significant difference does occur. There may be something different happening in Plant #2, after all. That said, if the source of snack boxes for sale were randomly distributed, consumers would, over time, be unlikely to be “cheated.” Quantifiable claims must be substantiated by the proper statistical analysis. While the CCACC may, in fact, have at least one valid point, the group cannot offer any legitimate evidence to support their claims. So, at least at this point, you should not testify on the group’s behalf.
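The test statistics reported for this case can be re-derived from the summary figures alone. The following is a minimal Python sketch, not the spreadsheet procedure used to produce the output; the function names are illustrative, and the Kruskal-Wallis formula is assumed to be the standard form without a tie correction.

```python
import math

# Summary statistics copied from the Anova: Single Factor output.
counts = [20, 20, 20, 20]
means = [372.4, 366.2035, 369.656, 376.586]
variances = [132.1037, 218.1177, 222.0002, 131.1284]


def one_way_anova_f(counts, means, variances):
    """Return (SSB, SSW, F) for a one-way ANOVA built from group summaries."""
    n_total = sum(counts)
    grand_mean = sum(n * m for n, m in zip(counts, means)) / n_total
    ssb = sum(n * (m - grand_mean) ** 2 for n, m in zip(counts, means))
    ssw = sum((n - 1) * v for n, v in zip(counts, variances))
    df_between = len(counts) - 1
    df_within = n_total - len(counts)
    f_stat = (ssb / df_between) / (ssw / df_within)
    return ssb, ssw, f_stat


def pooled_t(mean1, var1, n1, mean2, var2, n2):
    """Return (pooled variance, t) for the equal-variance two-sample t test."""
    pooled_var = ((n1 - 1) * var1 + (n2 - 1) * var2) / (n1 + n2 - 2)
    t_stat = (mean1 - mean2) / math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return pooled_var, t_stat


def kruskal_wallis_h(sum_sq_ranks_over_n, n_total):
    """H = 12 / (n(n+1)) * sum(T_j^2 / n_j) - 3(n+1); no tie correction (assumed)."""
    return 12 / (n_total * (n_total + 1)) * sum_sq_ranks_over_n - 3 * (n_total + 1)


ssb, ssw, f_stat = one_way_anova_f(counts, means, variances)
pooled_var, t_stat = pooled_t(372.880667, 164.518528, 60, 366.2035, 218.1177, 20)
h_stat = kruskal_wallis_h(134728.4, 80)

print("F =", round(f_stat, 4))
print("pooled variance =", round(pooled_var, 4), "t =", round(t_stat, 4))
print("H =", round(h_stat, 3))
```

Working from summaries keeps the sketch independent of the raw snack box data, which is not reproduced in this section.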



Chapter 12

Instructional Tips
The objectives for the digital case in this chapter are to have students:
1. Understand the difference between the results from a one-way table and a two-way contingency table.
2. Be able to use the chi-square test to determine whether a relationship exists between two categorical variables.
3. Be able to see the importance of examining differences between groups in their response to a categorical variable.

Solutions
1. They are literally true since 181 of the respondents prefer the Sun Low Concierge Class program as compared to 119 who prefer the T.C. Resorts TCRewards Plus program. However, since the program is described as aimed at business travelers, other interpretations of the data can be made.
2. By examining the preferences of business travelers, the target for the program, especially those business travelers who use travel programs, or by examining the resort last visited by type of traveler.
3. Program Preference by Travel Program

Observed Frequencies
                       Program Preference
Uses Travel Program    TCRewards Plus   Concierge Class   Total
Yes                    55               20                75
No                     64               161               225
Total                  119              181               300

Expected Frequencies
                       Program Preference
Uses Travel Program    TCRewards Plus   Concierge Class   Total
Yes                    29.75            45.25             75
No                     89.25            135.75            225
Total                  119              181               300

Data
Level of Significance        0.05
Number of Rows               2
Number of Columns            2
Degrees of Freedom           1

Results
Critical Value               3.841455338
Chi-Square Test Statistic    47.3606017
p-Value                      5.90578E-12
Reject the null hypothesis

Expected frequency assumption is met.


There is a significant difference in preference for TCRewards Plus versus Concierge Class based on whether the respondent uses a travel rewards program (χ²STAT = 47.361 > 3.841, p-value = 0.000 < 0.05). Those who use travel rewards programs clearly prefer TCRewards Plus (73.3%) over Concierge Class, while those who do not use travel rewards programs prefer Concierge Class (71.6%).

Program Preference by Customer Type

Observed Frequencies
                  Program Preference
Customer Type     TCRewards Plus   Concierge Class   Total
Business          34               16                50
Leisure           85               165               250
Total             119              181               300

Expected Frequencies
                  Program Preference
Customer Type     TCRewards Plus   Concierge Class   Total
Business          19.83333333      30.16666667       50
Leisure           99.16666667      150.8333333       250
Total             119              181               300

Data
Level of Significance        0.05
Number of Rows               2
Number of Columns            2
Degrees of Freedom           1

Results
Critical Value               3.841455338
Chi-Square Test Statistic    20.12628256
p-Value                      7.24936E-06
Reject the null hypothesis

Expected frequency assumption is met.

4. There is a significant difference in preference for TCRewards Plus versus Concierge Class based on whether the respondent is a business or leisure traveler (χ²STAT = 20.126 > 3.841, p-value = 0.000 < 0.05). Business travelers clearly prefer TCRewards Plus (68%) over Concierge Class, while leisure travelers prefer Concierge Class (66%). Further analysis indicates that of 41 business travelers who use travel reward programs, 31 prefer TCRewards Plus. Of 34 leisure travelers who use travel reward programs, 24 prefer TCRewards Plus. Thus, it is reasonable to conclude that TCRewards Plus is preferred by the target audience of business travelers and also by those who use travel reward programs. Among other factors that might be included in future surveys are whether the travel program influences the choice of accommodation, what attributes of a resort chain are desirable for business travelers, and the reasons for the attractiveness of Concierge Class for leisure travelers.
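The expected frequencies and chi-square statistics in both contingency tables can be checked with a few lines of Python. This is a minimal sketch rather than the worksheet that produced the output; `chi_square_2x2` is an illustrative name, and the p-value shortcut is valid only for 1 degree of freedom, which is the case for a 2 x 2 table.

```python
import math


def chi_square_2x2(observed):
    """Return (expected, chi-square statistic, p-value) for a 2 x 2 table."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    # Expected frequency = row total * column total / overall total.
    expected = [[r * c / n for c in col_totals] for r in row_totals]
    chi2 = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )
    # With 1 degree of freedom, the upper-tail chi-square probability
    # equals erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return expected, chi2, p_value


# Rows: uses travel program (Yes, No); columns: TCRewards Plus, Concierge Class.
travel_program = [[55, 20], [64, 161]]
# Rows: customer type (Business, Leisure); same columns.
customer_type = [[34, 16], [85, 165]]

exp_travel, chi2_travel, p_travel = chi_square_2x2(travel_program)
exp_customer, chi2_customer, p_customer = chi_square_2x2(customer_type)

print(exp_travel, round(chi2_travel, 4), p_travel)
print(exp_customer, round(chi2_customer, 4), p_customer)
```

Every expected frequency in both tables is well above 5, which is why the output reports that the expected frequency assumption is met.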



Chapter 13

Instructional Tips
The objectives for the digital case in this chapter are to have students:
1. Perform a simple linear regression analysis to determine the usefulness of an independent variable in predicting a dependent variable.
2. Understand the danger in making predictions that extrapolate beyond the range of the independent variable.

Solutions
1. Regression Analysis

Regression Statistics
Multiple R            0.698234618
R Square              0.487531581
Adjusted R Square     0.44482588
Standard Error        2.234863491
Observations          14

ANOVA
             df   SS             MS             F             Significance F
Regression   1    57.01890785    57.01890785    11.41607709   0.005480622
Residual     12   59.93537787    4.994614822
Total        13   116.9542857

                                   Coefficients    Standard Error   t Stat         P-value
Intercept                          -1.941218839    2.379988792      -0.815642009   0.430597414
Average Disposable Income($000)    0.192948059     0.05710603       3.378768576    0.005480622

Yes, there is a correlation between the variables, but not a very strong one, given the r² value of only 0.49. The sales projection claim should be discarded as Triangle is attempting to extrapolate sales outside the range of the X values. This raises a related point: Sunflowers clearly has not done business in areas of “exceptional affluence,” so there is no track record on which to base a decision to accept or reject Triangle’s proposal.

2. No, because the r² value of mean disposable income with sales is only 0.49 as compared to an r² value of 0.904 for store size. In fact, a multiple regression analysis reveals that given that store size is included in the regression model, adding mean disposable income does not significantly improve the model.

3. Yes, given the r² value of only 0.49, it is less significant than other single factors such as store size. However, opening a new retail location would be based on a number of factors (some of these factors, such as competitive retail analysis, demographic and geographic profiles, regional economic analysis, and sales potential forecast analysis, are actually mentioned by Triangle in its proposal).

4. The Sunflowers brand perception and merchandise mix would be important as well. For example, a store selling hip junior swimsuit fashions would not do well in a community of senior citizens in wintry Minnesota. The financial health of the Sunflowers chain would be another factor; many retail chains have gone out of business due to unwise overexpansion.
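The summary figures in the simple linear regression output are tied together by a few identities, and re-deriving them is a quick consistency check on the r² value of 0.49 cited in the solutions. The Python sketch below is illustrative only; the variable names are assumptions, and the inputs are copied from the ANOVA and coefficient tables.

```python
import math

n = 14                # observations
k = 1                 # number of independent variables
ssr = 57.01890785     # regression sum of squares
sse = 59.93537787     # residual sum of squares
b1 = 0.192948059      # slope for Average Disposable Income ($000)
se_b1 = 0.05710603    # standard error of the slope

sst = ssr + sse
r_square = ssr / sst
adj_r_square = 1 - (1 - r_square) * (n - 1) / (n - k - 1)
std_error = math.sqrt(sse / (n - k - 1))
f_stat = (ssr / k) / (sse / (n - k - 1))
t_stat = b1 / se_b1

print(round(r_square, 4), round(adj_r_square, 4), round(std_error, 4))
# In simple linear regression the F statistic is the square of the slope's t statistic.
print(round(f_stat, 4), round(t_stat ** 2, 4))
```

Because there is a single independent variable, the slope's p-value and Significance F coincide, as they do in the output (0.005480622).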



Chapter 14

Instructional Tips
The objectives for the digital case in this chapter are to have students:
1. Evaluate the contribution of dummy variables to a multiple regression model.
2. Determine whether an interaction term needs to be included in a regression model that has a dummy variable.

Solutions
1. Multiple Regression Analysis: Sales vs Price, Promotion, Location, Digital Coupon (Location: 0 = Food, 1 = New Arrivals; Digital Coupon: 0 = No, 1 = Yes)

Regression Statistics
Multiple R            0.9370
R Square              0.8780
Adjusted R Square     0.8623
Standard Error        454.2762
Observations          36

ANOVA
             df   SS               MS               F         Significance F
Regression   4    46061222.6158    11515305.6539    55.8002   0.0000
Residual     31   6397374.1342     206366.9076
Total        35   52458596.7500

                  Coefficients   Standard Error   t Stat     P-value
Intercept         14910.2954     1159.9239        12.8545    0.0000
Price             -3460.5908     310.6291         -11.1406   0.0000
Promotion         6.3938         0.9835           6.5010     0.0000
Location          -843.0954      155.0591         -5.4373    0.0000
Digital Coupon    149.3491       158.0277         0.9451     0.3519

The presence of Digital Coupon does not make a significant contribution to the multiple regression model since the p-value = 0.3519 > 0.05. Therefore, it should be eliminated from consideration in the model.

Multiple Regression Analysis: Sales vs Price, Promotion, Location (0 = Food, 1 = New Arrivals)

Regression Statistics
Multiple R            0.9352
R Square              0.8745
Adjusted R Square     0.8628
Standard Error        453.5174
Observations          36

ANOVA
             df   SS               MS               F         Significance F
Regression   3    45876899.8186    15292299.9395    74.3507   0.0000
Residual     32   6581696.9314     205678.0291
Total        35   52458596.7500

             Coefficients   Standard Error   t Stat     P-value
Intercept    14962.5323     1156.6709        12.9359    0.0000
Price        -3439.7472     309.3276         -11.1201   0.0000
Promotion    6.1440         0.9457           6.4964     0.0000
Location     -843.8204      154.7982         -5.4511    0.0000

Multiple Regression Analysis with Interaction Terms: Sales vs Price, Promotion, Location, Price*Location, Promotion*Location

Regression Statistics
Multiple R            0.9443
R Square              0.8916
Adjusted R Square     0.8736
Standard Error        435.2978
Observations          36

ANOVA
             df   SS               MS              F
Regression   5    46774071.3700    9354814.2740    49.3699
Residual     30   5684525.3800     189484.1793
Total        35   52458596.7500

                      Coefficients   Standard Error   t Stat    P-value
Intercept             14455.9223     1490.7939        9.6968    0.0000
Price                 -3197.0210     405.0027         -7.8938   0.0000
Promotion             4.3670         1.2364           3.5320    0.0014
Location              -60.0741       2239.5882        -0.0268   0.9788
Price*Location        -416.0789      597.9686         -0.6958   0.4919
Promotion*Location    3.7702         1.8287           2.0616    0.0480

For the interaction term, Price*Location, tSTAT = -0.6958 with a p-value of 0.4919. Because the p-value > 0.05, do not reject H0. There is not sufficient evidence that the interaction term makes a significant contribution.


Multiple Regression Analysis with Interaction Term: Sales vs Price, Promotion, Location, Promotion*Location

Regression Statistics
Multiple R            0.9433
R Square              0.8899
Adjusted R Square     0.8757
Standard Error        431.6610
Observations          36

ANOVA
             df   SS               MS               F
Regression   4    46682329.5186    11670582.3797    62.6335
Residual     31   5776267.2314     186331.2010
Total        35   52458596.7500

                      Coefficients   Standard Error   t Stat     P-value
Intercept             15145.4672     1104.4378        13.7133    0.0000
Price                 -3387.8897     295.4748         -11.4659   0.0000
Promotion             4.4205         1.2237           3.6123     0.0011
Location              -1594.2287     389.8476         -4.0894    0.0003
Promotion*Location    3.7703         1.8135           2.0791     0.0460

Thus, there is a significant effect of location on sales. The Location coefficient (-1,594.23) indicates lower predicted sales for the New Arrivals location (coded 1) than for the Food location (coded 0). However, the effect of the location is not the same across different levels of promotion: the positive Promotion*Location coefficient (3.77) means the size of the location effect decreases slightly with increasing levels of promotion expenses. In addition, there is no evidence of any patterns in the residual plots.

2. You would recommend using the location but not digital coupons.

3. Actual sales by linear display feet (the linear size of the product stock area), the number of QV digital coupons used per store, the number of stores using digital coupons, and the amount or existence of special in-store signage or advertising panels.


Chapter 15

Instructional Tips
The objectives for the digital case in this chapter are to have students:
1. Be able to determine which one of a set of competing claims concerning regression results is correct.
2. Use a model-building approach to determine the best-fitting model.
3. Evaluate the contribution of dummy variables to a multiple regression model.
4. Determine whether an interaction term needs to be included in a regression model that has a dummy variable.
5. Use the coefficient of partial determination to evaluate the importance of each independent variable.

Solutions
1. Regression Analysis: Sales vs Price, Promotion, Location, Digital Coupon with VIF (Location: 0 = Food, 1 = New Arrivals; Digital Coupon: 0 = No, 1 = Yes)

Regression Statistics
Multiple R            0.9370
R Square              0.8780
Adjusted R Square     0.8623
Standard Error        454.2762
Observations          36

             df   SS               MS               F
Regression   4    46061222.6158    11515305.6539    55.8002
Residual     31   6397374.1342     206366.9076
Total        35   52458596.7500

                  Coefficients   Standard Error   t Stat     P-value   VIF
Intercept         14910.2954     1159.9239        12.8545    0.0000
Price             -3460.5908     310.6291         -11.1406   0.0000    1.0099
Promotion         6.3938         0.9835           6.5010     0.0000    1.1250
Location          -843.0954      155.0591         -5.4373    0.0000    1.0486
Digital Coupon    149.3491       158.0277         0.9451     0.3519    1.0857

Multicollinearity is not an issue (all of the VIFs are small and less than 2). Digital coupon is not significant with p-value = 0.3519 > 0.05.
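Each VIF is defined as 1 / (1 - R²j), where R²j comes from regressing independent variable j on all the other independent variables. The Python sketch below inverts the reported VIFs to show how little of each variable is explained by the others. It is illustrative only: the function names are assumptions, and the cutoff of 5 is the rule of thumb applied in the Mountain States Potato Company case later in this manual, not a value stated for this case.

```python
def vif_from_r2(r2_j):
    """VIF for variable j, given R-square from regressing X_j on the other X's."""
    return 1.0 / (1.0 - r2_j)


def r2_from_vif(vif):
    """Invert the VIF formula to recover the auxiliary R-square."""
    return 1.0 - 1.0 / vif


# VIF values copied from the regression output above.
reported_vif = {
    "Price": 1.0099,
    "Promotion": 1.1250,
    "Location": 1.0486,
    "Digital Coupon": 1.0857,
}

VIF_CUTOFF = 5.0  # assumed rule-of-thumb threshold

implied_r2 = {name: r2_from_vif(v) for name, v in reported_vif.items()}
flagged = [name for name, v in reported_vif.items() if v > VIF_CUTOFF]

for name, r2 in implied_r2.items():
    print(f"{name}: implied auxiliary R-square = {r2:.4f}")
print("Variables above the cutoff:", flagged)
```

The largest implied auxiliary R² is about 0.11 (Promotion), which supports the conclusion that collinearity is negligible in this model.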

Best Subsets Regression Analysis: Sales vs Price (X1), Promotion (X2), Location (X3), Digital Coupon (X4)

Best Subsets Analysis

Intermediate Calculations
R2T        0.878049
1 - R2T    0.121951
n          36
T          5
n - T      31

Model       Cp         k+1   R Square   Adj. R Square   Std. Error
X1          89.7763    2     0.5209     0.5069          859.7297
X2          161.9328   2     0.2371     0.2146          1084.9413
X3          163.1847   2     0.2322     0.2096          1088.4374
X4          218.3295   2     0.0152     -0.0137         1232.6409
X1X2        31.5085    3     0.7580     0.7434          620.1982
X1X3        43.9560    3     0.7091     0.6914          680.0641
X1X4        90.3695    3     0.5265     0.4978          867.6035
X2X3        125.1366   3     0.3897     0.3527          984.9635
X2X4        163.9090   3     0.2372     0.1910          1101.1893
X3X4        162.8055   3     0.2415     0.1956          1098.0517
X1X2X3      3.8932     4     0.8745     0.8628          453.5174
X1X2X4      32.5637    4     0.7617     0.7394          624.9586
X1X3X4      45.2625    4     0.7118     0.6848          687.3624
X2X3X4      127.1127   4     0.3898     0.3326          1000.1582
X1X2X3X4    5.0000     5     0.8780     0.8623          454.2762

Based on a best subsets regression and examination of the resulting Cp values, the best model appears to be a model with variables X1, X2 and X3, which has Cp = 3.8932. Models that add other variables do not change the results very much.
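The Cp values in the best subsets table follow from Cp = (1 - R²k)(n - T) / (1 - R²T) - (n - 2(k + 1)). Because 1 - R² equals SSE / SST and every model shares the same SST, the ratio can be computed from residual sums of squares. The Python sketch below uses the SSE values printed in the ANOVA tables of this case; `mallows_cp` is an illustrative name, not a function from the software that produced the output.

```python
def mallows_cp(sse_subset, sse_full, n, t, k):
    """Cp for a subset model with k independent variables.

    sse_subset / sse_full equals (1 - R2_k) / (1 - R2_T) because both
    models share the same total sum of squares.
    n is the sample size and t is the number of parameters in the full model.
    """
    return sse_subset / sse_full * (n - t) - (n - 2 * (k + 1))


N = 36
T = 5                       # four independent variables plus the intercept
SSE_FULL = 6397374.1342     # residual SS of the model with X1, X2, X3, X4

# Residual sums of squares taken from the ANOVA tables in this case.
cp_x1 = mallows_cp(25130592.5833, SSE_FULL, N, T, k=1)
cp_x1x2 = mallows_cp(12693312.2083, SSE_FULL, N, T, k=2)
cp_x1x2x3 = mallows_cp(6581696.9314, SSE_FULL, N, T, k=3)
cp_full = mallows_cp(SSE_FULL, SSE_FULL, N, T, k=4)

print(round(cp_x1, 4), round(cp_x1x2, 4), round(cp_x1x2x3, 4), round(cp_full, 4))
```

A model is usually considered when Cp is at or below k + 1; the X1X2X3 model meets that rule (3.8932 against 4), and the full model always has Cp equal to T.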

Stepwise Regression: Sales vs Price (X1), Promotion (X2), Location (X3), Digital Coupon (X4)

Stepwise Regression Analysis
Table of Results for General Stepwise

Price entered.

             df   SS               MS               F
Regression   1    27328004.1667    27328004.1667    36.9729
Residual     34   25130592.5833    739135.0760
Total        35   52458596.7500

             Coefficients   Standard Error   t Stat    P-value
Intercept    16201.8750     2163.2971        7.4894    0.0000
Price        -3556.9444     584.9719         -6.0805   0.0000

Promotion entered.

             df   SS               MS               F
Regression   2    39765284.5417    19882642.2708    51.6908
Residual     33   12693312.2083    384645.8245
Total        35   52458596.7500

             Coefficients   Standard Error   t Stat    P-value
Intercept    14762.1250     1580.9818        9.3373    0.0000
Price        -3556.9444     421.9914         -8.4289   0.0000
Promotion    7.1988         1.2660           5.6863    0.0000

Location entered.

             df   SS               MS               F
Regression   3    45876899.8186    15292299.9395    74.3507
Residual     32   6581696.9314     205678.0291
Total        35   52458596.7500

             Coefficients   Standard Error   t Stat     P-value
Intercept    14962.5323     1156.6709        12.9359    0.0000
Price        -3439.7472     309.3276         -11.1201   0.0000
Promotion    6.1440         0.9457           6.4964     0.0000
Location     -843.8204      154.7982         -5.4511    0.0000

Based on a stepwise regression analysis with all the original variables, only X1, X2, and X3 make a significant contribution to the model at the 0.05 level. Thus, the best model includes price (X1), promotion (X2), and location (X3). It appears that the predicted increase in sales from Promotion is approximately 6 for each one-unit increase in promotion.

If only one independent variable could be used, the coefficient of partial determination would be helpful in determining which independent variable explained the most variation in sales holding constant the effect of the other independent variables.

Coefficients of partial determination
r²Y1.234    0.800145312
r²Y2.134    0.576863749
r²Y3.124    0.488142201
r²Y4.123    0.028005361

Price has the highest coefficient of partial determination, followed by promotion, location, and digital coupon. Another approach would be to perform a cost-benefit analysis on each variable and use the results as a basis for selection.

2. Stepwise and best subsets models both suggest that a model of Sales vs Price, Promotion, Location is best.

3. Sales vs Price, Promotion, Location Regression Analysis


Regression Statistics
Multiple R            0.9352
R Square              0.8745
Adjusted R Square     0.8628
Standard Error        453.5174
Observations          36

ANOVA
             df   SS               MS               F
Regression   3    45876899.8186    15292299.9395    74.3507
Residual     32   6581696.9314     205678.0291
Total        35   52458596.7500

             Coefficients   Standard Error   t Stat     P-value
Intercept    14962.5323     1156.6709        12.9359    0.0000
Price        -3439.7472     309.3276         -11.1201   0.0000
Promotion    6.1440         0.9457           6.4964     0.0000
Location     -843.8204      154.7982         -5.4511    0.0000

The best linear model is $\hat{Y} = 14{,}962.53 - 3{,}439.7472\,\text{Price} + 6.144\,\text{Promotion} - 843.82\,\text{Location}$, with $r^2 = 0.8745$.


Residual Analysis (residual plots not reproduced here)

4. Deborah Clair stated, “the most striking thing is the only store with over 4000 unit sales and only 100 in promotional dollars was a store with in-store digital coupons.” From the analysis, it is clear that in-store digital coupons are not a significant predictor of sales.
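The coefficients of partial determination reported in solution 1 can be recovered from the full four-variable model in two equivalent ways: from the t statistic of each variable, using r² = t² / (t² + residual df), or from the drop in regression sum of squares when the variable is removed. The Python sketch below is a check under those standard identities; the function names are illustrative assumptions.

```python
def partial_r2_from_t(t_stat, df_residual):
    """Coefficient of partial determination from a variable's t statistic."""
    return t_stat ** 2 / (t_stat ** 2 + df_residual)


def partial_r2_from_ss(ssr_full, ssr_without, sst):
    """Same quantity from sums of squares: SSR(Xj | others) / (SSE_full + SSR(Xj | others))."""
    contribution = ssr_full - ssr_without
    return contribution / (sst - ssr_full + contribution)


DF_RESIDUAL = 31  # residual df of the model with X1, X2, X3, X4

# t statistics from the four-variable regression output.
t_stats = {
    "Price": -11.1406,
    "Promotion": 6.5010,
    "Location": -5.4373,
    "Digital Coupon": 0.9451,
}
partial_r2 = {name: partial_r2_from_t(t, DF_RESIDUAL) for name, t in t_stats.items()}

# Cross-check for Digital Coupon using the ANOVA tables with and without X4.
coupon_check = partial_r2_from_ss(46061222.6158, 45876899.8186, 52458596.7500)

for name, value in partial_r2.items():
    print(f"{name}: {value:.4f}")
print("Digital Coupon via sums of squares:", round(coupon_check, 6))
```

Digital Coupon explains under 3% of the variation left unexplained by the other three variables, consistent with the conclusion that in-store digital coupons are not significant.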




Chapter 16

Instructional Tips
The objectives for the digital case in this chapter are to have students:
1. Be able to develop a time-series forecasting model using quarterly data.
2. Be able to compare the results of two forecasts and plot the raw time-series data on a graph.
3. Interpret the results of the time-series forecasting model including the compound growth rate and the seasonal multiplier.

Solutions
1. Oxford Glen Remodeling: Regression Analysis: log(OG) vs Coded Quarter, Q1, Q2, Q3

Regression Statistics
Multiple R            0.9964
R Square              0.9927
Adjusted R Square     0.9901
Standard Error        0.0016
Observations          16

ANOVA
             df   SS       MS       F
Regression   4    0.0039   0.0010   376.4251
Residual     11   0.0000   0.0000
Total        15   0.0039

                 Coefficients   Standard Error   t Stat      P-value
Intercept        4.9983         0.0011           4378.0460   0.0000
Coded Quarter    0.0030         0.0001           33.5374     0.0000
Q1               -0.0126        0.0012           -10.7820    0.0000
Q2               -0.0089        0.0012           -7.7671     0.0000
Q3               -0.0101        0.0011           -8.8098     0.0000

The regression model for the Oxford Glen Remodeling is

Log(revenue) = 4.9983 + 0.0030 Coded Quarter - 0.0126 Quarter 1 - 0.0089 Quarter 2 - 0.0101 Quarter 3

$\log_{10}\hat{\beta}_1 = 0.0030;\ \hat{\beta}_1 = 10^{0.0030} = 1.00697$, then $(\hat{\beta}_1 - 1)100\% = 0.697\%$
$\log_{10}\hat{\beta}_2 = -0.0126;\ \hat{\beta}_2 = 10^{-0.0126} = 0.9714$, then $(\hat{\beta}_2 - 1)100\% = -2.862\%$
$\log_{10}\hat{\beta}_3 = -0.0089;\ \hat{\beta}_3 = 10^{-0.0089} = 0.9796$, then $(\hat{\beta}_3 - 1)100\% = -2.040\%$
$\log_{10}\hat{\beta}_4 = -0.0101;\ \hat{\beta}_4 = 10^{-0.0101} = 0.9771$, then $(\hat{\beta}_4 - 1)100\% = -2.289\%$

The interpretation of the slopes is as follows:
• The estimated quarterly compound growth rate in revenue is 0.697%.
• 0.9714 is the seasonal multiplier for the first quarter as compared to the fourth quarter. Sales are 2.86% lower for the first quarter as compared to the fourth quarter.
• 0.9796 is the seasonal multiplier for the second quarter as compared to the fourth quarter. Sales are 2.04% lower for the second quarter as compared to the fourth quarter.
• 0.9771 is the seasonal multiplier for the third quarter as compared to the fourth quarter. Sales are 2.29% lower for the third quarter as compared to the fourth quarter.

Sycamore Homes Remodelers: Regression Analysis: log(OG) vs Coded Quarter, Q1, Q2, Q3

Regression Statistics
Multiple R            0.9724
R Square              0.9456
Adjusted R Square     0.9259
Standard Error        0.0178
Observations          16

ANOVA
             df   SS       MS       F
Regression   4    0.0608   0.0152   47.8242
Residual     11   0.0035   0.0003
Total        15   0.0643

                 Coefficients   Standard Error   t Stat     P-value
Intercept        4.7546         0.0126           376.0441   0.0000
Coded Quarter    -0.0085        0.0010           -8.4887    0.0000
Q1               -0.1607        0.0130           -12.4079   0.0000
Q2               -0.1035        0.0128           -8.1077    0.0000
Q3               -0.0923        0.0126           -7.3031    0.0000

The regression model for the Sycamore Homes Remodelers is

Log(revenue) = 4.7546 - 0.0085 Coded Quarter - 0.1607 Quarter 1 - 0.1035 Quarter 2 - 0.0923 Quarter 3

$\log_{10}\hat{\beta}_1 = -0.0085;\ \hat{\beta}_1 = 10^{-0.0085} = 0.9807$, then $(\hat{\beta}_1 - 1)100\% = -1.929\%$
$\log_{10}\hat{\beta}_2 = -0.1607;\ \hat{\beta}_2 = 10^{-0.1607} = 0.6907$, then $(\hat{\beta}_2 - 1)100\% = -30.934\%$
$\log_{10}\hat{\beta}_3 = -0.1035;\ \hat{\beta}_3 = 10^{-0.1035} = 0.7880$, then $(\hat{\beta}_3 - 1)100\% = -21.198\%$
$\log_{10}\hat{\beta}_4 = -0.0923;\ \hat{\beta}_4 = 10^{-0.0923} = 0.80846$, then $(\hat{\beta}_4 - 1)100\% = -19.154\%$

The interpretation of the slopes is as follows:
• The estimated quarterly compound growth rate in sales is -1.93%.
• 0.6907 is the seasonal multiplier for the first quarter as compared to the fourth quarter. Sales are 30.93% lower for the first quarter as compared to the fourth quarter.
• 0.7880 is the seasonal multiplier for the second quarter as compared to the fourth quarter. Sales are 21.2% lower for the second quarter as compared to the fourth quarter.
• 0.8085 is the seasonal multiplier for the third quarter as compared to the fourth quarter. Sales are 19.15% lower for the third quarter as compared to the fourth quarter.

These results refute the claims of the Sycamore Homes Remodelers. First, it is more appropriate to examine the data from four years than just the last year. Second, examining the data from the four years, the quarterly growth rate for the Oxford Glen Remodeling is +0.7% as compared to a negative growth rate of almost 2% for the Sycamore Homes Remodelers. Finally, the Oxford Glen Remodeling has small seasonal effects of 2 to 3% as compared to the fourth quarter, while the Sycamore Homes Remodelers has large seasonal effects of between 19 and 31% as compared to the fourth quarter.
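The growth rates and seasonal multipliers for both firms come from back-transforming the log10 regression coefficients. The Python sketch below repeats that conversion with the rounded coefficients printed in the output, so the last digits differ slightly from the reported values, which were presumably computed from unrounded coefficients. The `forecast` helper and the dictionary layout are illustrative assumptions, not part of the original solution.

```python
def multiplier(log10_coef):
    """Back-transform a log10 coefficient into a multiplier."""
    return 10 ** log10_coef


def pct_change(log10_coef):
    """Percentage change implied by a log10 coefficient."""
    return (multiplier(log10_coef) - 1) * 100


def forecast(model, coded_quarter, quarter):
    """Forecast revenue; quarter 4 is the baseline with no seasonal term."""
    log_value = (
        model["intercept"]
        + model["trend"] * coded_quarter
        + model["seasonal"].get(quarter, 0.0)
    )
    return 10 ** log_value


oxford_glen = {
    "intercept": 4.9983,
    "trend": 0.0030,
    "seasonal": {1: -0.0126, 2: -0.0089, 3: -0.0101},
}
sycamore = {
    "intercept": 4.7546,
    "trend": -0.0085,
    "seasonal": {1: -0.1607, 2: -0.1035, 3: -0.0923},
}

for name, model in (("Oxford Glen", oxford_glen), ("Sycamore Homes", sycamore)):
    growth = pct_change(model["trend"])
    seasonal = {q: round(multiplier(b), 4) for q, b in model["seasonal"].items()}
    print(f"{name}: quarterly growth {growth:.3f}%, seasonal multipliers {seasonal}")
```

Calling `forecast(oxford_glen, coded_quarter, quarter)` with a coded quarter beyond the 16 observed quarters would extend the fitted trend, subject to the usual caution about extrapolation.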



2.

Oxford Glen Remodeling: Its steady continuous growth with little variability from season to season. Sycamore Homes Remodelers: The decline that occurred in year 3 did not continue. Sales in year 4 stabilized at about the same level as year 3.

3.

Among other variables might be the actual number of homes remodeled, the demographics of the home owners, and the number of repeat customers.




Chapter 20

Instructional Tips
The objectives for the digital case in this chapter are to have students:
1. Be able to use several criteria to determine a chosen course of action.
2. Revise probabilities in light of new information and determine if a previous course of action selected has changed.
3. Realize that better is a subjective word in making decisions.

Solutions
1. Probabilities & Payoffs Table:

                  P     StraightDeal   Happy Bull   Worried Bear
fast expanding    0.1   150            1200         -300
expanding         0.2   100            600          -200
stable            0.5   95             -100         100
recession         0.2   80             -900         400

Statistics for:              StraightDeal    Happy Bull     Worried Bear
Expected Monetary Value      98.5            10             60
Variance                     340.25          382900         50400
Standard Deviation           18.44586675     618.7891402    224.4994432
Coefficient of Variation     0.187267683     61.87891402    3.741657387
Return to Risk Ratio         5.339949668     0.016160594    0.267261242
Expected Opportunity Loss    271.5 (EVPI)    360            310

Better is a subjective term that cannot be solely determined by a statistical analysis. If you accept the probabilities of the various events, StraightDeal should be selected since it has the highest expected monetary value ($98.50), the highest return-to-risk ratio (5.34), and the lowest expected value of perfect information ($271.50).

2. Bayes’ Theorem Calculations

                  Probabilities
Event             Prior   Conditional   Joint   Revised
fast expanding    0.1     0.9           0.09    0.1765
expanding         0.2     0.75          0.15    0.2941
stable            0.5     0.5           0.25    0.4902
recession         0.2     0.1           0.02    0.0392
Total:                                  0.51

Probabilities & Payoffs Table:

                  P        StraightDeal   Happy Bull   Worried Bear
fast expanding    0.1765   150            1200         -300
expanding         0.2941   100            600          -200
stable            0.4902   95             -100         100
recession         0.0392   80             -900         400

Statistics for:              StraightDeal    Happy Bull      Worried Bear
Expected Monetary Value      105.59          303.96          -47.07
Variance                     437.9369        304298.3184     36607.4151
Standard Deviation           20.92694196     551.6324124     191.3306434
Coefficient of Variation     0.198190567     1.814819096     -4.06481077
Return to Risk Ratio         5.045648819     0.551019108     -0.24601391
Expected Opportunity Loss    347.37          149 (EVPI)      500.03

Now the choice of which fund to invest in is much more difficult. Although Happy Bull has a higher expected monetary value than StraightDeal and a lower expected value of perfect information, it also has a much lower return-to-risk ratio. Perhaps a better approach would be to use the portfolio management approach covered in Section 5.2 to invest a proportion of assets in StraightDeal and a proportion in Happy Bull. For example, investing 70% in StraightDeal and 30% in Happy Bull would provide a portfolio expected return of $165.10 and a portfolio risk of $178.14, substantially below the standard deviation of Happy Bull of $551.63.
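The decision criteria used in this case (expected monetary value, expected opportunity loss, Bayesian revision, and the 70/30 portfolio) can all be computed from the payoff table and the two probability columns. The Python sketch below is an illustration under the figures printed above, with hypothetical function names. It uses unrounded revised probabilities, so results can differ in the second decimal place from the reported values, which are based on the four-decimal probabilities.

```python
import math

prior = [0.1, 0.2, 0.5, 0.2]
# Conditional column of the Bayes' theorem table.
conditional = [0.9, 0.75, 0.5, 0.1]
# Payoffs by event: fast expanding, expanding, stable, recession.
payoffs = {
    "StraightDeal": [150, 100, 95, 80],
    "Happy Bull": [1200, 600, -100, -900],
    "Worried Bear": [-300, -200, 100, 400],
}


def emv(probs, values):
    """Expected monetary value."""
    return sum(p * v for p, v in zip(probs, values))


def std_dev(probs, values):
    """Standard deviation of a discrete payoff distribution."""
    mean = emv(probs, values)
    return math.sqrt(sum(p * (v - mean) ** 2 for p, v in zip(probs, values)))


def expected_opportunity_loss(probs, table):
    """EOL per action: expected shortfall from the best payoff in each event."""
    best = [max(column) for column in zip(*table.values())]
    return {
        name: sum(p * (b - v) for p, b, v in zip(probs, best, values))
        for name, values in table.items()
    }


def revise(prior_probs, conditional_probs):
    """Bayes' theorem: joint = prior * conditional, then normalize."""
    joint = [p * c for p, c in zip(prior_probs, conditional_probs)]
    total = sum(joint)
    return [j / total for j in joint]


posterior = revise(prior, conditional)
prior_emv = {name: emv(prior, v) for name, v in payoffs.items()}
prior_eol = expected_opportunity_loss(prior, payoffs)
post_emv = {name: emv(posterior, v) for name, v in payoffs.items()}
post_eol = expected_opportunity_loss(posterior, payoffs)

# 70% StraightDeal / 30% Happy Bull portfolio, evaluated event by event.
blend = [0.7 * s + 0.3 * h for s, h in zip(payoffs["StraightDeal"], payoffs["Happy Bull"])]
portfolio_return = emv(posterior, blend)
portfolio_risk = std_dev(posterior, blend)

print(prior_emv, prior_eol)
print([round(p, 4) for p in posterior])
print(post_emv, post_eol)
print(round(portfolio_return, 2), round(portfolio_risk, 2))
```

Blending the payoffs event by event gives the same portfolio risk as the variance-covariance formula, without computing the covariance separately.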

The Craybill Instrumentation Company Case

Chapter 15



1. (a) Let Y = Sales, X1 = Wonderlic Personnel Test score, X2 = Strong-Campbell Interest Inventory Test score, X3 = experience, X4 = 1 with a degree in electrical engineering; 0 otherwise.

Based on a full regression model involving all of the variables: All VIFs are less than 5. So there is no reason to suspect collinearity between any pair of variables.

The best-subset approach yields the following models to be considered:

Partial PHStat output from the best-subsets selection:

Model       Cp         k   R Square   Adj. R Square   Std. Error   Consider This Model?
X1X2X3X4    5          5   0.593228   0.552551222     11.74203     Yes
X1X2X4      3.101155   4   0.5922     0.562360664     11.6126      Yes
X2X3X4      3.001097   4   0.593217   0.563452639     11.59811     Yes
X2X4        1.101172   3   0.5922     0.572780469     11.47353     Yes

Partial PHStat output of the full regression model:

                  Coefficients   Standard Error   t Stat    P-value
Intercept         25.7683        13.9537          1.8467    0.0722
Wonder            -0.0134        0.4050           -0.0331   0.9737
SC                1.3514         0.1947           6.9407    0.0000
Experience        0.1682         0.5287           0.3180    0.7521
Engineer Dummy    7.2747         4.1011           1.7738    0.0837

Since the p-values for X1 and X3 are considerably larger than 0.05, they do not have a significant effect individually on sales. The best model should include both X2 and X4.

PHStat output of the model with only X2 and X4:

Regression Statistics
Multiple R            0.7695
R Square              0.5922
Adjusted R Square     0.5728
Standard Error        11.4735
Observations          45

ANOVA
             df   SS          MS          F         Significance F
Regression   2    8029.0413   4014.5207   30.4958   6.59784E-09
Residual     42   5528.9587   131.6419
Total        44   13558

                  Coefficients   Standard Error   t Stat   P-value
Intercept         26.8910        9.7718           2.7519   0.0087
SC                1.3408         0.1792           7.4824   0.0000
Engineer Dummy    7.2869         3.9857           1.8282   0.0746


Although the p-value for the Engineer Dummy variable is also > 0.05, it is close enough to 0.05 to retain the variable in the model because the managers consider it important. Therefore, the most appropriate model to predict sales is

$\hat{Y} = 26.8910 + 1.3408X_2 + 7.2869X_4$

With the exception of one single residual point at a value of -44.58 when SC = 54, there is no specific pattern in the residual plot.

(b) According to the finding in (a), the company only needs to administer the Strong-Campbell test.
(c) According to the model in (a), the variable X4 helps predict sales and, hence, the idea of only hiring electrical engineers should be supported.
(d) Prior selling experience (X3) does not help predict sales according to the model chosen in (a).
(e) The company only needs to administer the Strong-Campbell test to save time and money. It should consider giving hiring preference to sales managers with an electrical engineering degree.
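To see what the model chosen in (a) implies for an individual candidate, the fitted equation can be evaluated directly. The short Python sketch below uses the coefficients from the two-variable output; the function name and the sample score of 60 are hypothetical, and the units of sales are whatever units the case data use, which this section does not state.

```python
def predicted_sales(sc_score, has_ee_degree):
    """Evaluate Y-hat = 26.8910 + 1.3408 * X2 + 7.2869 * X4.

    sc_score is the Strong-Campbell score (X2); has_ee_degree sets the
    dummy X4 to 1 for an electrical engineering degree, 0 otherwise.
    """
    x4 = 1 if has_ee_degree else 0
    return 26.8910 + 1.3408 * sc_score + 7.2869 * x4


# Hypothetical candidate with a Strong-Campbell score of 60.
without_degree = predicted_sales(60, has_ee_degree=False)
with_degree = predicted_sales(60, has_ee_degree=True)

print(round(without_degree, 4), round(with_degree, 4))
print("Difference attributable to the degree:", round(with_degree - without_degree, 4))
```

At any fixed Strong-Campbell score, the predicted difference between the two groups is the dummy coefficient, 7.2869, because the model has no interaction term.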

The Mountain States Potato Company Case

Chapter 15

The independent variables involved are the pH of the filter cake (PH), the pressure in the vacuum line below the fluid line on the rotating drum (LOWER), the pressure of the vacuum line above the fluid line on the rotating drum (UPPER), cake thickness measured on the drum (THICK), setting used to control the drum speed (VARIDRIV), and the speed at which the drum was rotated when collecting filter cake (DRUMSPD). These data are contained in the POTATO file. We begin our analysis of the potato processing data by first measuring the amount of collinearity that exists between the explanatory variables through the use of the variance inflationary factor (VIF). The following figure represents partial PHStat output for a multiple linear regression model in which the percent of solids is predicted from the six explanatory variables. We observe that four of the VIFs are above 5.0, ranging from 8.4 for Upper to 9.9 for Varidriv. Thus, based on the criteria developed by Snee, there is evidence of collinearity among at least some of the explanatory variables. A reasonable strategy is to remove the independent variable with the largest VIF above 5, and determine what effect this has on the VIF of the remaining independent variables.

Regression Analysis (each explanatory variable regressed on all other X)

Variable              Multiple R   R Square   Adjusted R Square   Standard Error   Observations   VIF
PH                    0.561502     0.315284   0.243959            0.232255         54             1.460459
Lower Pressure        0.939267     0.882222   0.869954            0.716466         54             8.490574
Upper Pressure        0.93823      0.880276   0.867805            0.766658         54             8.352558
Cake Thickness        0.616598     0.380193   0.31563             0.108131         54             1.613407
Varidriv speed        0.948243     0.899165   0.888661            0.179475         54             9.917201
Drum speed setting    0.946259     0.895406   0.884511            2.150324         54             9.560793

Figure 1 Regression Model obtained from PHStat2 to Predict Percent of Solids based on Six Explanatory Variables

The following figure represents the regression model obtained from PHStat with only the Varidriv variable removed from the model.

Regression Analysis (each explanatory variable regressed on all other X)

Variable              Multiple R   R Square   Adjusted R Square   Standard Error   Observations   VIF
PH                    0.527979     0.278761   0.219885            0.235924         54             1.386504
Lower Pressure        0.938614     0.880996   0.871281            0.7128           54             8.403063
Upper Pressure        0.937879     0.879617   0.869789            0.760882         54             8.306799
Cake Thickness        0.614269     0.377327   0.326496            0.10727          54             1.605978
Drum speed setting    0.330551     0.109264   0.036551            6.210805         54             1.122667

Figure 2 Regression Model obtained from PHStat2 to Predict Percent of Solids based on Five Explanatory Variables excluding the Varidriv independent variable

From the figure above, we see that the VIF for the Drumspeed independent variable has been reduced from 9.6 to 1.1. This indicates that Varidriv and Drumspeed were highly correlated with each other, but largely uncorrelated with the other independent variables. However, we also observe that the VIF values for Lower and Upper are still above 5, being equal to 8.4 and 8.3, respectively. Using the criterion of removing the independent variable with the highest VIF above 5, we can remove the Lower independent variable from the model. The following figure represents PHStat output for a model that has excluded the Lower and Varidriv independent variables.



Regression Analysis (each explanatory variable regressed on all other X)

Variable              Multiple R    R Square      Adjusted R Square   Standard Error   Observations   VIF
PH                    0.516068032   0.266326213   0.222305786         0.235557799      54             1.363003583
Upper Pressure        0.530366      0.281288      0.238165            1.840452         54             1.391378
Cake Thickness        0.603839337   0.364621944   0.326499261         0.10726935       54             1.573866128
Drum speed setting    0.29969       0.089814      0.035203            6.215147         54             1.098677

Figure 3 Regression Model obtained from PHStat2 to Predict Percent of Solids based on Four Explanatory Variables excluding the Lower and Varidriv independent variables

From the figure above, we see that none of the remaining four independent variables has a VIF value above 1.6. The Lower independent variable was undoubtedly highly correlated with the Upper independent variable, and its removal left us with four relatively uncorrelated independent variables: pH, Upper, Thick, and Drumspeed.
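The VIF values in the PHStat output follow directly from each variable's R Square when it is regressed on all the other X variables, since VIF = 1 / (1 - R Square). The following is a minimal sketch of that arithmetic and of the "largest VIF above 5" screening rule; the function name and dictionary keys are illustrative only, and the R Square values are copied from the six-variable output.

```python
def vif_from_r2(r2: float) -> float:
    """VIF for one explanatory variable, given the R Square obtained by
    regressing that variable on all the other explanatory variables."""
    if not 0.0 <= r2 < 1.0:
        raise ValueError("r2 must be in [0, 1)")
    return 1.0 / (1.0 - r2)


# R Square values from the six-variable output (hypothetical short names).
r2_six = {
    "PH": 0.315284,
    "Lower": 0.882222,
    "Upper": 0.880276,
    "Thick": 0.380193,
    "Varidriv": 0.899165,
    "Drumspd": 0.895406,
}

vifs = {name: vif_from_r2(r2) for name, r2 in r2_six.items()}

# Variables whose VIF exceeds 5, largest first; the first entry is the
# candidate for removal under the rule described in the text.
flagged = sorted((n for n, v in vifs.items() if v > 5.0),
                 key=vifs.get, reverse=True)

for name, value in vifs.items():
    print(f"{name:10s} VIF = {value:.4f}")
print("Remove first:", flagged[0])
```

Because the R Square values are rounded to six decimals, the recomputed VIFs agree with the PHStat figures only to about three decimal places.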

The Stepwise Regression Approach to Model Building



We now continue our analysis of these data by attempting to determine the explanatory variables that might be deleted from the complete model. We shall first utilize stepwise regression. The figure below represents a partial output obtained from the PHStat add-in for Microsoft Excel for the potato processing data.



Stepwise Analysis
Table of Results for General Stepwise

PH entered.

             df   SS            MS            F             Significance F
Regression   1    25.35907426   25.35907426   17.67231123   0.000103538
Residual     52   74.61796278   1.434960823
Total        53   99.97703704

             Coefficients   Standard Error   t Stat        P-value       Lower 95%      Upper 95%
Intercept    0.782076396    2.45805091       0.318169324   0.751630817   -4.150360268   5.714513059
PH           2.589618022    0.616011802      4.203844815   0.000103538   1.353500744    3.825735299

Upper Pressure entered.

             df   SS            MS            F             Significance F
Regression   2    41.46414438   20.73207219   18.07013179   1.16804E-06
Residual     51   58.51289266   1.147311621
Total        53   99.97703704

                 Coefficients   Standard Error   t Stat        P-value       Lower 95%      Upper 95%
Intercept        3.839576804    2.344527788      1.637675963   0.107645513   -0.86725551    8.546409118
PH               2.834310878    0.554678372      5.109827637   4.87591E-06   1.720748436    3.947873319
Upper Pressure   -0.263257541   0.070265187      -3.74662834   0.00045751    -0.404320681   -0.1221944

No other variables could be entered into the model. Stepwise ends.

Figure 4 Stepwise regression output obtained from the PHStat2 add-in for Microsoft Excel for the potato processing data

For this example, a significance level of .05 was utilized either to enter a variable into the model or to delete a variable from the model. The first variable entered into the model is pH. Since the p-value of .0001 is less than .05, pH is included in the regression model.


The next step involves the evaluation of the second variable to be included in this model. The variable to be chosen is the one that will make the largest contribution to the model, given that the first explanatory variable has already been selected. For this model, the second variable is Upper pressure. Since the p-value of .00046 for Upper pressure is less than .05, Upper pressure is included in the regression model. Now that Upper pressure has been entered into the model, we determine whether pH is still an important contributing variable or whether it may be eliminated from the model. Since the p-value of .000004876 (4.87591E-06 in scientific notation) for pH is also less than .05, pH should remain in the regression model. The next step involves the determination of whether any of the remaining variables should be added to the model. Since none of the other variables meets the .05 criterion for entry into the model, the stepwise procedure terminates with a model that includes pH and Upper pressure.
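The F statistics and r-squared values reported at each stepwise step can be cross-checked from the sums of squares alone. The sketch below is only an arithmetic check under the usual ANOVA definitions (MSR = SSR/k, MSE = SSE/(n - k - 1), F = MSR/MSE, r-squared = SSR/SST); the function name is hypothetical and the inputs are copied from the stepwise output.

```python
def anova_summary(ssr: float, sse: float, k: int, n: int) -> dict:
    """Return MSR, MSE, F, and r-squared for a regression with k
    independent variables fitted to n observations."""
    msr = ssr / k
    mse = sse / (n - k - 1)
    return {
        "MSR": msr,
        "MSE": mse,
        "F": msr / mse,
        "r2": ssr / (ssr + sse),
    }


# Step 1: PH only. Step 2: PH and Upper Pressure.
step1 = anova_summary(ssr=25.35907426, sse=74.61796278, k=1, n=54)
step2 = anova_summary(ssr=41.46414438, sse=58.51289266, k=2, n=54)

for label, step in (("PH", step1), ("PH + Upper", step2)):
    print(f"{label:12s} F = {step['F']:.4f}  r2 = {step['r2']:.6f}")
```

Adding Upper pressure raises r-squared from roughly 0.25 to roughly 0.41, which is the contribution the stepwise procedure is rewarding at the second step.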



The Best Subset Approach to Model Building

The best subset approach evaluates either all possible regression models for a given set of independent variables or at least the best subset of models for a given number of independent variables. The figure below represents partial output obtained from the PHStat2 add-in for Microsoft Excel in which all regression models for a given number of parameters were evaluated according to two widely used criteria, the adjusted r2 and the Cp statistic.

Best Subsets Analysis

Intermediate Calculations
R2T        0.428728
1 - R2T    0.571272
n          54
T          5
n - T      49

Model      Cp         k + 1   R Square   Adj. R Square   Std. Error   Consider This Model?
X1         14.0171    2       0.253649   0.239296084     1.197899     No
X1X2       2.200053   3       0.414737   0.391785177     1.071126     Yes
X1X2X3     3.038685   4       0.428277   0.393973229     1.069198     Yes
X1X2X3X4   5          5       0.428728   0.382093161     1.079627     Yes
X1X2X4     4.136941   4       0.415472   0.380400828     1.081104     No
X1X3       15.01919   3       0.265283   0.236470785     1.200121     No
X1X3X4     16.79688   4       0.267875   0.223947575     1.209923     No
X1X4       15.80859   3       0.25608    0.226906562     1.207614     No
X2         25.90084   2       0.115101   0.098083637     1.304354     No
X2X3       25.97922   3       0.137504   0.103681097     1.3003       No
X2X3X4     27.03666   4       0.148493   0.097402952     1.304846     No
X2X4       26.46649   3       0.131823   0.097777342     1.304575     No
X3         28.92168   2       0.079882   0.062187562     1.330057     No
X3X4       30.54398   3       0.084286   0.048375212     1.339816     No
X4         34.92666   2       0.009872   -0.009168586    1.37973      No

Figure 5 Best subsets regression output obtained from the PHStat2 add-in for Microsoft Excel for the potato processing data

The first criterion that is often used is the adjusted r2, which adjusts the r2 of each model to account for the number of variables in the model. Since models with different numbers of independent variables are to be compared, the adjusted r2 is the appropriate criterion here rather than r2. Referring to the figure above, we observe that the adjusted r2 reaches a maximum value of .39397 for the model that includes the independent variables pH, Upper, and Thick plus the intercept term (for a total of four terms). We note that the model selected by using stepwise regression, which includes pH and Upper, has an adjusted r2 of .39179. Thus, the best subset approach, unlike stepwise regression, has provided us with several alternative models to evaluate in greater depth using other criteria such as parsimony, interpretability, and departure from model assumptions (as evaluated by residual analysis).

A second criterion often used in the evaluation of competing models is based on the Cp statistic developed by Mallows. When a regression model with k independent variables contains only random differences from a true model, the average value of Cp is k + 1, the number of parameters. Thus, in evaluating many alternative regression models, our goal is to find models whose Cp is close to or below k + 1. From the previous figure, we observe that there are three models that have a Cp value equal to or below k + 1. These are the models with X1 and X2; with X1, X2, and X3; and with X1, X2, X3, and X4. Since the models with X1 and X2 and with X1, X2, and X3 have fewer variables and also have Cp less than k + 1, we will focus on these two models.
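Both criteria can be recomputed from the R Square column and the intermediate calculations (R2T, n, T) in the best subsets output. The sketch below assumes the usual textbook forms, adjusted r2 = 1 - (1 - r2)(n - 1)/(n - k - 1) and Cp = (1 - R2k)(n - T)/(1 - R2T) - [n - 2(k + 1)], where k is the number of independent variables in the candidate model and T is the number of parameters in the full model. Function and variable names are illustrative.

```python
def adjusted_r2(r2: float, n: int, k: int) -> float:
    """Adjusted r-squared for a model with k independent variables."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)


def mallows_cp(r2_k: float, r2_full: float, n: int, k: int, t: int) -> float:
    """Cp for a k-variable model, where t is the number of parameters
    (including the intercept) in the full model."""
    return (1.0 - r2_k) * (n - t) / (1.0 - r2_full) - (n - 2 * (k + 1))


N, T, R2_FULL = 54, 5, 0.428728  # intermediate calculations from the output

# (R Square, k) for the two models the discussion focuses on.
candidates = {"X1X2": (0.414737, 2), "X1X2X3": (0.428277, 3)}

results = {}
for name, (r2, k) in candidates.items():
    results[name] = {
        "adj_r2": adjusted_r2(r2, N, k),
        "cp": mallows_cp(r2, R2_FULL, N, k, T),
        "k_plus_1": k + 1,
    }
    print(name, results[name])
```

For the full model the first term reduces to n - T, so Cp equals T exactly, which is why the four-variable row shows Cp = 5.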
One approach for choosing between models that meet the criterion of Cp less than k + 1 is to determine whether the models contain a subset of variables that are common, and then test whether the contribution of the additional variables is significant. In this case, that would mean testing whether variable X3 made a significant contribution to the regression model given that variables X1 and X2 were already included in the model. If the contribution was statistically significant, then variable X3 would be included in the regression model. If variable X3 did not make a statistically significant contribution, variable X3 would not be included in the model. The following figure represents a regression model that includes variables X1, X2, and X3 (pH, Upper, and Thick).

Regression Analysis

Regression Statistics
Multiple R          0.654428477
R Square            0.428276632
Adjusted R Square   0.393973229
Standard Error      1.069197909
Observations        54

ANOVA
             df   SS            MS            F             Significance F
Regression   3    42.81782866   14.27260955   12.48496083   3.26679E-06
Residual     50   57.15920838   1.143184168
Total        53   99.97703704

                 Coefficients   Standard Error   t Stat         P-value       Lower 95%      Upper 95%
Intercept        2.625587479    2.592611079      1.012719378    0.316070006   -2.581827253   7.833002211
PH               3.148146196    0.624290082      5.042761826    6.41161E-06   1.89422215     4.402070241
Upper Pressure   -0.309346256   0.081934686      -3.775522569   0.000424913   -0.473916983   -0.144775529
Cake Thickness   1.531919658    1.407781964      1.088179631    0.281733384   -1.295694788   4.359534103

Figure 6 Regression Model obtained from PHStat2 to Predict Percent of Solids based on Three Explanatory Variables including the pH, Upper, and Thick independent variables

From this figure, we observe that Thick (X3) has a t value of 1.09 and a p-value of .282. Since the p-value of .282 > .05, we can conclude that Thick (X3) does not make a significant contribution to the regression model given that pH (X1) and Upper pressure (X2) are included. Therefore, a reasonable approach is to eliminate Thick (X3) from the model and fit the regression model that includes pH (X1) and Upper pressure (X2). The following figure represents PHStat output for this model.
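The t test on Thick is equivalent to a partial F test that compares the three-variable model with the two-variable model: the extra regression sum of squares divided by the MSE of the larger model equals the square of the t statistic. A minimal sketch of that check follows, using sums of squares copied from the PHStat output; the function name is an assumption made for illustration.

```python
def partial_f(ssr_full: float, ssr_reduced: float, mse_full: float,
              extra_terms: int = 1) -> float:
    """Partial F statistic for the variables added to the reduced model."""
    return (ssr_full - ssr_reduced) / extra_terms / mse_full


# SSR with pH, Upper, Thick; SSR with pH, Upper; MSE of the larger model.
f_thick = partial_f(ssr_full=42.81782866,
                    ssr_reduced=41.46414438,
                    mse_full=1.143184168)

# t statistic for Thick = coefficient / standard error.
t_thick = 1.531919658 / 1.407781964

print(f"partial F = {f_thick:.4f}")
print(f"t         = {t_thick:.4f}, t squared = {t_thick ** 2:.4f}")
```

Both routes give a statistic of about 1.18 on 1 and 50 degrees of freedom, so they lead to the same decision about Thick.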



Regression Analysis

Regression Statistics
Multiple R          0.644000528
R Square            0.41473668
Adjusted R Square   0.391785177
Standard Error      1.071126333
Observations        54

ANOVA
             df   SS            MS            F             Significance F
Regression   2    41.46414438   20.73207219   18.07013179   1.16804E-06
Residual     51   58.51289266   1.147311621
Total        53   99.97703704

                 Coefficients   Standard Error   t Stat        P-value       Lower 95%      Upper 95%
Intercept        3.839576804    2.344527788      1.637675963   0.107645513   -0.86725551    8.546409118
PH               2.834310878    0.554678372      5.109827637   4.87591E-06   1.720748436    3.947873319
Upper Pressure   -0.263257541   0.070265187      -3.74662834   0.00045751    -0.404320681   -0.1221944

Figure 7 Regression Model obtained from PHStat2 to Predict Percent of Solids based on Two Explanatory Variables including the pH and Upper independent variables

The following residual plots do not suggest any need for a nonlinear transformation. The Durbin-Watson statistic of 1.5509 is greater than \(d_U = 1.47\) at the 10% level of significance, so there is not sufficient evidence to conclude that there is positive autocorrelation in the residuals. Thus, we can conclude that raising the pH and/or reducing the Upper pressure should result in an increased percentage of solids.






Chapter 18

JMP output of the regression tree:

According to the partition of the regression tree for solids, the first split occurs at pH level = 4.2. When pH level  4.2, the percentage of solids in the filter cakes is higher with a mean of 12.28%. Among those with pH level  4.2, the next split occurs at drum speed = 37.45. Those with drum speed  37.45 have a higher mean percentage of solids at 13.92%. All the other splits produce lower mean percentages of solids. Hence, to maintain the highest percentage of solids in the filter cakes, it is recommended that the pH level be set at  4.2 with a drum speed setting at  37.45.

The O. Hara Performance Consulting Case

Chapter 13

1.

Simple Regression Analysis: Internal Rating vs WGCTA Score

Regression Statistics
Multiple R          0.8961
R Square            0.8030
Adjusted R Square   0.7960
Standard Error      0.3651
Observations        30

ANOVA
             df   SS        MS        F
Regression   1    15.2150   15.2150   114.1619
Residual     28   3.7317    0.1333
Total        29   18.9467

              Coefficients   Standard Error   t Stat    P-value
Intercept     0.0934         0.7221           0.1293    0.8981
WGCTA Score   0.0965         0.0090           10.6847   0.0000

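\(\text{Rating} = b_0 + b_1(\text{WGCTA}) = 0.0934 + 0.0965(\text{WGCTA})\)

The p-value = 0.0000 < 0.05 for the t test for the significance of the slope coefficient. Reject H0 and conclude that WGCTA score is significant in predicting job performance.

2.

\(\text{Rating} = b_0 + b_1(\text{WGCTA}) = 0.0934 + 0.0965(89) = 8.6836\)

Confidence interval estimate for the mean rating: \(8.4625 \le \mu_{\text{Rating}\mid\text{WGCTA}=89} \le 8.9047\)

Prediction interval for an individual rating: \(7.9038 \le \text{Rating}_{\text{WGCTA}=89} \le 9.4634\)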
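The point prediction and the summary statistics in the regression output can be reproduced from the reported coefficients and sums of squares. This is a minimal sketch using the four-decimal values as printed, so the prediction differs from 8.6836 in the third decimal place; the names are illustrative.

```python
B0, B1 = 0.0934, 0.0965  # intercept and slope as reported (rounded)


def predict_rating(wgcta: float) -> float:
    """Predicted internal rating for a given WGCTA score."""
    return B0 + B1 * wgcta


pred_89 = predict_rating(89)

# Cross-checks from the ANOVA table.
ssr, sse, df_resid = 15.2150, 3.7317, 28
r_square = ssr / (ssr + sse)
f_stat = ssr / (sse / df_resid)

print(f"Predicted rating at WGCTA = 89: {pred_89:.4f}")
print(f"r2 = {r_square:.4f}, F = {f_stat:.2f}")
```

In simple linear regression the F statistic equals the square of the slope's t statistic, which gives another quick consistency check on the output (10.6847 squared is about 114.16).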

3.

Since a WGCTA score of 89 falls outside the domain of the WGCTA scores, you should be concerned that the linear relationship that exists between rating and WGCTA scores might not continue to hold true outside the domain.



4.

The normality assumption of the errors might have been slightly violated.


The Sure Value Convenience Stores Case

Chapter 8

1.

\[
\bar{X} \pm t\,\frac{S}{\sqrt{n}} = 978 \pm 2.0345\left(\frac{90}{\sqrt{43}}\right), \qquad 950.08 \le \mu \le 1005.92
\]

2.

Based on the evidence gathered from the sample of 43 stores, the 95% confidence interval for the mean per-store count in all of the franchise’s stores is from 950.08 to 1005.92. With a 95% level of confidence, the franchise can conclude that the mean per-store count in all of its stores is somewhere between 950.08 and 1005.92, which is above the mean per-store count of 925 before the price reduction. Hence, reducing coffee prices is a good strategy to increase the mean customer count.




Chapter 9

1. PHStat output:

t Test for Hypothesis of the Mean

Data
Null Hypothesis μ =           925
Level of Significance         0.01
Sample Size                   43
Sample Mean                   978
Sample Standard Deviation     90

Intermediate Calculations
Standard Error of the Mean    13.7249
Degrees of Freedom            42
t Test Statistic              3.8616

Upper-Tail Test
Upper Critical Value          2.4185
p-Value                       0.0000
Reject the null hypothesis

\(H_0\colon \mu \le 925\) The mean customer count is not more than 925.
\(H_1\colon \mu > 925\) The mean customer count is more than 925.

A Type I error occurs when you conclude the mean customer count is more than 925 when in fact the mean number is not more than 925. A Type II error occurs when you conclude the mean customer count is not more than 925 when in fact the mean number is more than 925.


Decision rule: If \(t_{STAT} > 2.4185\) or the p-value < 0.01, reject H0.

Test statistic: \(t_{STAT} = \dfrac{\bar{X} - \mu}{S/\sqrt{n}} = \dfrac{978 - 925}{90/\sqrt{43}} = 3.8616\); the p-value is virtually 0.

Decision: Since tSTAT = 3.8616 is greater than 2.4185 or the p-value is less than 0.01, reject H0. There is enough evidence to conclude that reducing coffee prices is a good strategy for increasing the mean customer count. When the null hypothesis is true, the probability of obtaining a sample whose mean is 978 or more is virtually 0.
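The intermediate calculations in the PHStat output (standard error, test statistic, decision) can be reproduced with a few lines of arithmetic. The sketch below takes the upper critical value 2.4185 from the output rather than computing it; the function name is hypothetical.

```python
import math


def one_sample_t(xbar: float, mu0: float, s: float, n: int):
    """Return (t statistic, standard error of the mean) for a one-sample t test."""
    std_error = s / math.sqrt(n)
    return (xbar - mu0) / std_error, std_error


t_stat, std_error = one_sample_t(xbar=978, mu0=925, s=90, n=43)

UPPER_CRITICAL = 2.4185  # df = 42, 0.01 level, as reported by PHStat
reject_h0 = t_stat > UPPER_CRITICAL

print(f"standard error = {std_error:.4f}")
print(f"t statistic    = {t_stat:.4f}")
print("Reject H0" if reject_h0 else "Do not reject H0")
```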




Chapter 10

1.

Stores that priced the “short” coffee at $0.99

\(H_0\colon \mu \le 925\) vs. \(H_1\colon \mu > 925\)

t Test for Hypothesis of the Mean

Data
Null Hypothesis μ =           925
Level of Significance         0.05
Sample Size                   15
Sample Mean                   972
Sample Standard Deviation     85

Intermediate Calculations
Standard Error of the Mean    21.9469
Degrees of Freedom            14
t Test Statistic              2.1415

Upper-Tail Test
Upper Critical Value          1.7613
p-Value                       0.0252
Reject the null hypothesis

Since the p-value = 0.0252 < 0.05, reject H0. There is evidence that reducing the price of a “short” coffee to $0.99 increases per store average daily customer count.

Stores that priced the small coffee at $1.09

\(H_0\colon \mu \le 925\) vs. \(H_1\colon \mu > 925\)

t Test for Hypothesis of the Mean

Data
Null Hypothesis μ =           925
Level of Significance         0.05
Sample Size                   15
Sample Mean                   951
Sample Standard Deviation     64

Intermediate Calculations
Standard Error of the Mean    16.5247
Degrees of Freedom            14
t Test Statistic              1.5734

Upper-Tail Test
Upper Critical Value          1.7613
p-Value                       0.0690
Do not reject the null hypothesis

Since the p-value = 0.0690 > 0.05, do not reject H0. There is insufficient evidence that reducing the price of a “short” coffee to $1.09 increases per store average daily customer count.



2.

\(H_0\colon \sigma_1^2 = \sigma_2^2\) vs. \(H_1\colon \sigma_1^2 \ne \sigma_2^2\)

F Test for Difference in Two Variances

Data
Level of Significance                     0.05
Large-Variance Sample
  Sample Size                             15
  Sample Standard Deviation               85
Small-Variance Sample
  Sample Size                             15
  Sample Standard Deviation               64

Intermediate Calculations
F Test Statistic                          1.3281
Population 1 Sample Degrees of Freedom    14
Population 2 Sample Degrees of Freedom    14

Two-Tail Test
Upper Critical Value                      2.9786
p-Value                                   0.6026
Do not reject the null hypothesis

Since the p-value = 0.6026 > 0.05, do not reject H0. There is not enough evidence that the two variances are different. Hence, a pooled-variance t test is appropriate.



2. cont.

\(H_0\colon \mu_1 = \mu_2\) vs. \(H_1\colon \mu_1 \ne \mu_2\)

Pooled-Variance t Test for the Difference Between Two Means (assumes equal population variances)

Data
Hypothesized Difference                   0
Level of Significance                     0.05
Population 1 Sample
  Sample Size                             15
  Sample Mean                             972
  Sample Standard Deviation               85
Population 2 Sample
  Sample Size                             15
  Sample Mean                             951
  Sample Standard Deviation               64

Intermediate Calculations
Population 1 Sample Degrees of Freedom    14
Population 2 Sample Degrees of Freedom    14
Total Degrees of Freedom                  28
Pooled Variance                           5660.5000
Difference in Sample Means                21.0000
t Test Statistic                          0.7644

Two-Tail Test
Lower Critical Value                      -2.0484
Upper Critical Value                      2.0484
p-Value                                   0.4510
Do not reject the null hypothesis

Since the p-value = 0.4510 > 0.05, do not reject H0. There is not enough evidence of a difference in the per store daily customer count between stores in which a small coffee was priced at $0.99 and stores in which a “short” coffee was priced at $1.09 for an eight-ounce cup.

3.

Since there is not enough evidence of a difference in the per store mean daily customer count between stores in which a small coffee was priced at $0.99 and stores in which a “short” coffee was priced at $1.09 for an eight-ounce cup, you will recommend that a “short” coffee should be priced at $1.09 since that will bring in more profit per cup.
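The pooled variance and t statistic in the PHStat output can be verified from the two sample sizes, means, and standard deviations. The following is a minimal sketch of that arithmetic; the function name is illustrative, and the critical value 2.0484 is taken from the output rather than computed.

```python
import math


def pooled_variance_t(x1: float, s1: float, n1: int,
                      x2: float, s2: float, n2: int):
    """Return (pooled variance, t statistic) for H0: mu1 = mu2,
    assuming equal population variances."""
    pooled_var = ((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / (n1 + n2 - 2)
    t_stat = (x1 - x2) / math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return pooled_var, t_stat


# $0.99 stores versus $1.09 stores.
pooled_var, t_stat = pooled_variance_t(972, 85, 15, 951, 64, 15)

CRITICAL = 2.0484  # two-tail, df = 28, 0.05 level, as reported by PHStat
reject_h0 = abs(t_stat) > CRITICAL

print(f"pooled variance = {pooled_var:.4f}")
print(f"t statistic     = {t_stat:.4f}")
print("Reject H0" if reject_h0 else "Do not reject H0")
```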



Chapter 11

1.

\(H_0\colon \sigma_1^2 = \sigma_2^2 = \sigma_3^2 = \sigma_4^2\)
\(H_1\colon\) At least one variance is different.

Excel output for Levene’s test for homogeneity of variance:

Source of Variation   SS            df   MS          F        P-value   F crit
Between Groups        22848.6136    3    7616.2045   0.8574   0.4710    2.8387
Within Groups         355296.5455   40   8882.4136
Total                 378145.1591   43

Level of significance   0.05

Since the p-value = 0.4710 > 0.05, do not reject H0. There is not enough evidence of a difference in the variation in daily customer count among the different prices.

You can perform the one-way ANOVA F test for the difference in means.

\(H_0\colon \mu_1 = \mu_2 = \mu_3 = \mu_4\), where 1 = $0.99, 2 = $1.09, 3 = $1.19, 4 = $1.29
\(H_1\colon\) At least one mean is different.

Decision rule: df: 3, 40. If F > 2.84, reject H0.

Excel output:

ANOVA
Source of Variation   SS             df   MS            F         P-value   F crit
Between Groups        1544831.7273   3    514943.9091   25.0813   0.0000    2.8387
Within Groups         821239.4545    40   20530.9864
Total                 2366071.1818   43

Level of significance   0.05

Test statistic: FSTAT = 25.0813 Decision: Since FSTAT = 25.0813 is greater than the critical bound of 2.84, reject H0. There is evidence of a difference in the mean daily customer count based on the price of a short coffee.



2.

To determine which of the means are significantly different from one another, you perform the Tukey-Kramer procedure. PHStat output:

Tukey-Kramer Multiple Comparisons

Group            Sample Mean   Sample Size
1: $0.99 Sales   1630.273      11
2: $1.09 Sales   1973.364      11
3: $1.19 Sales   1599.636      11
4: $1.29 Sales   1465.273      11

Other Data
Level of significance   0.05
Numerator d.f.          4
Denominator d.f.        40
MSW                     20530.99
Q Statistic             3.79

Comparison           Absolute Difference   Std. Error of Difference   Critical Range   Results
Group 1 to Group 2   343.0909              43.20246875                163.7            Means are different
Group 1 to Group 3   30.63636              43.20246875                163.7            Means are not different
Group 1 to Group 4   165                   43.20246875                163.7            Means are different
Group 2 to Group 3   373.7273              43.20246875                163.7            Means are different
Group 2 to Group 4   508.0909              43.20246875                163.7            Means are different
Group 3 to Group 4   134.3636              43.20246875                163.7            Means are not different

Most pairs of means differ: only the $0.99 and $1.19 means and the $1.19 and $1.29 means are not significantly different. In ascending order of mean daily customer count, the prices are $1.29, $1.19, $0.99, and $1.09.

3.

If the objective is to maximize the mean daily customer counts, the store should sell the short coffee at $1.09. Even though the mean daily customer counts are highest when the short coffee price is at $1.09, to determine the optimal price to maximize profit, you will need to know the cost of the coffee.
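The critical range used in the Tukey-Kramer comparisons for problem 2 can be recomputed from the Q statistic, MSW, and the group sizes, and each pairwise difference can then be compared against it. This sketch assumes the usual form, critical range = Q * sqrt((MSW / 2)(1/n_j + 1/n_j')); the Q value 3.79 is taken from the PHStat output and the names are illustrative.

```python
import math
from itertools import combinations


def critical_range(q: float, msw: float, n_i: int, n_j: int) -> float:
    """Tukey-Kramer critical range for comparing two group means."""
    return q * math.sqrt(msw / 2 * (1 / n_i + 1 / n_j))


# Sample means by price; every group has 11 stores.
means = {"$0.99": 1630.273, "$1.09": 1973.364,
         "$1.19": 1599.636, "$1.29": 1465.273}

cr = critical_range(q=3.79, msw=20530.99, n_i=11, n_j=11)

# True where the absolute difference exceeds the critical range.
different = {(a, b): abs(means[a] - means[b]) > cr
             for a, b in combinations(means, 2)}

print(f"critical range = {cr:.1f}")
for (a, b), flag in different.items():
    verdict = "different" if flag else "not different"
    print(f"{a} vs {b}: |diff| = {abs(means[a] - means[b]):.3f} -> {verdict}")
```

The $0.99 versus $1.29 comparison is the close call: the difference of 165 exceeds the critical range by only a little over one customer per day.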



Chapter 12

PHStat output:

Kruskal-Wallis Rank Test for Differences in Medians

Data
Level of Significance                 0.05

Intermediate Calculations
Sum of Squared Ranks/Sample Size      26700.41
Sum of Sample Sizes                   44
Number of Groups                      4

Test Result
H Test Statistic                      26.8207
Critical Value                        7.8147
p-Value                               0.0000
Reject the null hypothesis

(a)

H0: M1 = M2 = M3= M4 H1: At least one of the medians differs. Since the p-value is virtually 0 < 0.05, reject H0. There is enough evidence of a difference in the median daily customer count based on the price of a cup of short coffee.

(b)

Even though you can conclude that there is enough evidence of a difference in the median daily customer count based on the price of a cup of short coffee, you cannot determine which price is optimal to maximize the median daily customer count. From Chapter 11, you have found out that the price to maximize mean daily customer count is $1.09.
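The H test statistic in the PHStat output follows from the two intermediate quantities it reports. A minimal sketch, assuming the standard form H = [12 / (n(n + 1))] * sum(T_j^2 / n_j) - 3(n + 1), with the chi-square critical value 7.8147 (3 degrees of freedom) taken from the output; the function name is hypothetical.

```python
def kruskal_wallis_h(sum_sq_ranks_over_n: float, n_total: int) -> float:
    """H statistic from the sum over groups of (rank sum squared / group size)
    and the combined sample size."""
    return (12 / (n_total * (n_total + 1)) * sum_sq_ranks_over_n
            - 3 * (n_total + 1))


h_stat = kruskal_wallis_h(sum_sq_ranks_over_n=26700.41, n_total=44)

CRITICAL = 7.8147  # chi-square, 4 - 1 = 3 df, 0.05 level, as reported
reject_h0 = h_stat > CRITICAL

print(f"H = {h_stat:.4f}")
print("Reject H0" if reject_h0 else "Do not reject H0")
```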

The Choice Is Yours/More Descriptive Choices Follow-up Case



Chapter 2

1.

A complete answer should compile descriptive statistics for each type and compare. (This should be a pivot table.) “Blend” funds are a blend of growth and value funds and would be expected to have returns somewhere between the other two. A pivot using the mean returns confirms this:

Row Labels    Mean of 1Yr Return   Mean of 3Yr Return   Mean of 5Yr Return   Mean of 10Yr Return
Growth        -1.99                16.76                16.09                13.75
Value         16.18                12.57                9.46                 10.89
Blend         10.21                14.05                11.43                11.82
Grand Total   7.42                 14.63                12.59                12.27

There are no strong patterns other than growth funds having greater returns than the other categories over longer periods of time.

2.

There are two types of answers possible: (1) What constitutes a “better offering” would depend on information about individual investors that is not available. (2) Blend is a more balanced approach, which likely will appeal to conservative investors.



Chapter 3

More Descriptive Choices Follow-up

Redo Example 3.5 for 3-Yr Return by Type

Mean 3Yr Return
                  Risk Level
Type              Low     Average   High    Grand Total
Growth            17.66   16.91     15.64   16.76
  Small           14.42   15.16     14.11   14.47
  Mid-Cap         15.58   14.85     15.14   15.12
  Large           19.05   18.48     17.55   18.48
Value             12.69   12.90     12.03   12.57
  Small           11.33   12.15     11.09   11.44
  Mid-Cap         11.42   13.40     11.92   12.31
  Large           13.34   12.91     12.60   12.97
Blend             13.99   14.47     13.42   14.05
  Small           9.58    10.66     10.74   10.44
  Mid-Cap         12.06   12.90     12.11   12.53
  Large           16.35   16.59     16.53   16.51
Grand Total       15.01   14.92     13.82   14.63

The results are very similar to the results from Example 3.5 for the growth and value funds. Overall, the three-year return for the blend fund is between the value and growth funds. However, there are exceptions. For small cap at all the different risk levels (and mid-cap at the average risk level), the blend fund has a lower mean than both the value and growth funds. For mid-cap at the low and high risk levels, the blend fund is between the value and growth funds. For large cap at all the different risk levels, the blend fund is between the value and growth funds.


Redo Example 3.10 for 3-Yr Return by Type: Descriptive Summary

                      Value     Blend     Growth
Mean                  12.57     14.05     16.76
Median                12.46     14.40     16.79
Mode                  13.39     17.96     16.37
Minimum               4.00      -11.54    -5.66
Maximum               24.78     24.10     49.84
Range                 20.78     35.64     55.5
Variance              8.2900    18.1555   25.4926
Standard Deviation    2.8792    4.2609    5.0490
Coeff. of Variation   22.90%    30.32%    30.13%
Skewness              0.4545    -1.6355   0.9852
Kurtosis              1.1577    8.1970    10.7005
Count                 326       368       413
Standard Error        0.1595    0.2221    0.2484

The blend fund has a median of 14.40%, which is between the value fund median of 12.46% and the growth fund median of 16.79%. The blend fund standard deviation of 4.2609% is also between those of the value and growth funds, 2.8792% and 5.0490%, respectively. While the growth and value funds each show right (positive) skewness, the blend fund shows left (negative) skewness. The value, blend, and growth funds all show positive kurtosis, with the blend fund between the value and growth funds.



Redo Example 3.14 for 3-Yr Return by Type: Five-Number Summary and Boxplot

Five-Number Summary
                 Value    Blend    Growth
Minimum          4        -11.54   -5.66
First Quartile   10.65    11.48    14.085
Median           12.455   14.4     16.79
Third Quartile   13.91    17.17    19.195
Maximum          24.78    24.1     49.84

From the five-number summary, the blend fund is not between the value and growth funds for either the minimum or the maximum. The blend fund is between value and growth for the first quartile (10.65%, 11.48%, 14.085%), the median (12.455%, 14.4%, 16.79%), and the third quartile (13.91%, 17.17%, 19.195%).

The boxplots for the blend, growth, and value funds confirm our results here and in the redo of Example 3.10.




Chapter 3

The Choice Is Yours Follow-up

1.

Descriptive Summary for 1-Yr Return Percentage

                      Value     Blend     Growth
Mean                  16.18     10.21     -1.99
Median                16.29     11.38     0.14
Mode                  17.35     10.45     10.76
Minimum               -11.26    -23.80    -47.37
Maximum               31.16     40.49     17.41
Range                 42.42     64.29     64.78
Variance              20.7292   55.7301   153.4057
Standard Deviation    4.5529    7.4653    12.3857
Coeff. of Variation   28.15%    73.11%    -623.43%
Skewness              -0.5865   -0.4418   -0.8948
Kurtosis              4.2388    3.5711    0.8148
Count                 326       368       413
Standard Error        0.2522    0.3892    0.6095
First Quartile        13.76     5.86      -9.915
Third Quartile        18.71     15.25     7.7

Relative to the mean, the growth fund has much more variation than the blend or value funds. The CV for blend is 73.11%, while the CV for growth and value is -623.43% and 28.15%, respectively.

Descriptive Summary for 5-Yr Return Percentage

Copyright ©2024 Pearson Education, Inc.


Value

Blend

Growth

Mean

9.46

11.43

16.09

Median

9.41

11.61

16.09

Mode

8

9.47

15.1

Minimum

3.15

-4.17

0.42

Maximum

16.04

18.87

36.54

Range

12.89

23.04

36.12

Variance

4.8062

10.8459

16.0104

Standard Deviation

2.1923

3.2933

4.0013

Coeff. of Variation

23.18%

28.81%

24.86%

Skewness

0.2396

-0.8801

0.4714

Kurtosis

0.2366

2.1827

4.0251

Count

326

368

413

Standard Error

0.1214

0.1717

0.1969

First Quartile

7.9

9.16

13.615

Third Quartile

10.85

14.16

18.18

Relative to the mean, the blend fund has somewhat more variation than the growth or value funds. The CV for blend is 28.81%, while the CVs for value and growth are 23.18% and 24.86%, respectively.
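The coefficient of variation is the standard deviation expressed as a percentage of the mean, CV = (S / mean) × 100%. The sketch below recomputes it from the rounded five-year means and standard deviations in the table, so the results can differ from the tabulated CVs in the last digit; the function and variable names are illustrative assumptions.

```python
def coefficient_of_variation(std_dev, mean):
    """Return the CV as a percentage: (S / mean) * 100."""
    return std_dev / mean * 100.0


# (mean, standard deviation) for the 5-year return %, rounded as in the table.
five_year = {
    "Value":  (9.46, 2.1923),
    "Blend":  (11.43, 3.2933),
    "Growth": (16.09, 4.0013),
}

cv = {fund: coefficient_of_variation(s, m) for fund, (m, s) in five_year.items()}

# The fund with the largest relative variation (compared in absolute value).
largest = max(cv, key=lambda fund: abs(cv[fund]))

for fund, pct in cv.items():
    print(f"{fund:7s} CV = {pct:.2f}%")
print("Largest relative variation:", largest)
```

Comparing absolute values matters when a mean is negative, as with the one-year growth return, where the CV itself is negative.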

Descriptive Summary for 10-Yr Return Percentage

                        Value      Blend     Growth
Mean                    10.89      11.82      13.75
Median                  10.87      12.07      13.80
Mode                    11.16      11.21      14.11
Minimum                  5.51       3.87       5.02
Maximum                 16.86      15.59      24.83
Range                   11.35      11.72      19.81
Variance               2.1725     4.3122     5.0269
Standard Deviation     1.4739     2.0766     2.2421
Coeff. of Variation    13.53%     17.56%     16.30%
Skewness               0.1148    -0.8496     0.2490
Kurtosis               1.9090     0.8581     3.0728
Count                     326        368        413
Standard Error         0.0816     0.1082     0.1103
First Quartile          10.07      10.74     12.315
Third Quartile           11.7      13.46      14.95

Relative to the mean, the blend fund has slightly more variation than the growth or value funds. The CV for blend is 17.56%, while the CVs for value and growth are 13.53% and 16.30%, respectively.

4.

For low-risk funds only.

1YR Low risk
                        Value      Blend     Growth
Mean                    16.41      11.41       0.17
Median                  16.73      11.67       3.06
Mode                    18.45      13.84       9.53
Minimum                  2.38      -9.42     -46.79
Maximum                 31.16      40.49      17.41
Range                   28.78      49.91       64.2
Variance              22.1795    60.1060   156.8410
Standard Deviation     4.7095     7.7528    12.5236
Coeff. of Variation    28.69%     67.96%   7496.53%
Skewness              -0.0084     0.6161    -1.1566
Kurtosis               1.1931     3.3840     1.4601
Count                      93         96        119
Standard Error         0.4884     0.7913     1.1480

For low-risk funds only, for the one-year return, relative to the mean, the growth fund has much more variation than the blend or value funds. The CV for growth is 7,496.53%, while the CVs for value and blend are 28.69% and 67.96%, respectively.

5YR Low risk
                        Value      Blend     Growth
Mean                     9.66      11.47      16.77
Median                   9.47      11.59      16.84
Mode                    11.35       9.47      17.01
Minimum                  3.30      -3.46       8.16
Maximum                 15.03      18.87      36.54
Range                   11.73      22.33      28.38
Variance               4.6819    12.6689    18.0637
Standard Deviation     2.1638     3.5593     4.2501
Coeff. of Variation    22.40%     31.04%     25.35%
Skewness               0.1174    -1.4485     1.1005
Kurtosis               0.4531     4.6346     3.9165
Count                      93         96        119
Standard Error         0.2244     0.3633     0.3896

For low-risk funds only, for the five-year return, relative to the mean, the blend fund has more variation than the growth or value funds. The CV for blend is 31.04%, while the CVs for value and growth are 22.40% and 25.35%, respectively.

10YR Low risk
                        Value      Blend     Growth
Mean                    10.93      11.89      14.22
Median                  10.94      12.34      14.13
Mode                     11.5      10.26      11.72
Minimum                  5.51       4.65       8.24
Maximum                 15.65      15.19      24.83
Range                   10.14      10.54      16.59
Variance               2.2753     4.1087     5.9623
Standard Deviation     1.5084     2.0270     2.4418
Coeff. of Variation    13.81%     17.06%     17.17%
Skewness              -0.1621    -1.3353     0.5057
Kurtosis               2.1607     2.6052     2.3954
Count                      93         96        119
Standard Error         0.1564     0.2069     0.2238

For low-risk funds only, for the ten-year return, relative to the mean, the growth fund has slightly more variation than the blend fund and more than the value fund. The CV for growth is 17.17%, while the CVs for value and blend are 13.81% and 17.06%, respectively.


Chapter 4

1.

Market cap by Type:

Count of Market Cap
                    Type
Market Cap     Growth   Value   Grand Total
Small              92      59           151
Mid-Cap           102      59           161
Large             219     208           427
Grand Total       413     326           739

Market cap by Risk:

Count of Market Cap
                    Risk Level
Market Cap      Low   Average   High   Grand Total
Small            29        43     79           151
Mid-Cap          45        71     45           161
Large           138       192     97           427
Grand Total     212       306    221           739

Market cap by Rating:

Count of Market Cap
                    Star Rating
Market Cap      One   Two   Three   Four   Five   Grand Total
Small             7    32      69     27     16           151
Mid-Cap          12    43      56     41      9           161
Large            23   112     166     86     40           427
Grand Total      42   187     291    154     65           739

Type by Risk:

Count of Type
                    Risk Level
Type            Low   Average   High   Grand Total
Growth          119       173    121           413
Value            93       133    100           326
Grand Total     212       306    221           739

Type by Rating:

Count of Type
                    Star Rating
Type            One   Two   Three   Four   Five   Grand Total
Growth           23   103     167     87     33           413
Value            19    84     124     67     32           326
Grand Total      42   187     291    154     65           739

Risk and Rating:

Count of Risk Level
                    Star Rating
Risk            One   Two   Three   Four   Five   Grand Total
Low              11    46      92     43     20           212
Average          16    69     126     67     28           306
High             15    72      73     44     17           221
Grand Total      42   187     291    154     65           739

2.

Market cap by Type:
Marginal probabilities table:

Joint and Marginal Probabilities P(Market Cap, Type)
                    Type
Market Cap     Growth    Value   P(Market Cap)
Small          0.1245   0.0798          0.2043
Mid-Cap        0.1380   0.0798          0.2179
Large          0.2963   0.2815          0.5778
P(Type)        0.5589   0.4411          1.0000

Conditional probabilities:

Conditional Probabilities P(Type|Market Cap)
                    Type
Market Cap     Growth    Value
Small          0.6093   0.3907
Mid-Cap        0.6335   0.3665
Large          0.5129   0.4871

Conditional Probabilities P(Market Cap|Type)
                    Type
Market Cap     Growth    Value
Small          0.2228   0.1810
Mid-Cap        0.2470   0.1810
Large          0.5303   0.6380
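The joint, marginal, and conditional probabilities for market cap by type all follow from the count table: each joint probability is a cell count divided by the grand total, each marginal is a row or column total divided by the grand total, and each conditional probability is a joint probability divided by the marginal of the conditioning event. A minimal sketch, assuming only the counts reported in Problem 1 (variable names are illustrative):

```python
# Contingency counts for market cap (rows) by fund type (columns).
counts = {
    "Small":   {"Growth": 92,  "Value": 59},
    "Mid-Cap": {"Growth": 102, "Value": 59},
    "Large":   {"Growth": 219, "Value": 208},
}
types = ["Growth", "Value"]

total = sum(sum(row.values()) for row in counts.values())

# Joint probabilities P(Market Cap and Type).
joint = {cap: {t: n / total for t, n in row.items()} for cap, row in counts.items()}

# Marginal probabilities: row totals give P(Market Cap), column totals give P(Type).
p_cap = {cap: sum(row.values()) / total for cap, row in counts.items()}
p_type = {t: sum(counts[cap][t] for cap in counts) / total for t in types}

# Conditional probabilities: joint divided by the marginal of the given event.
p_type_given_cap = {cap: {t: joint[cap][t] / p_cap[cap] for t in types} for cap in counts}
p_cap_given_type = {cap: {t: joint[cap][t] / p_type[t] for t in types} for cap in counts}

for cap in counts:
    print(cap,
          {t: round(p_type_given_cap[cap][t], 4) for t in types},
          {t: round(p_cap_given_type[cap][t], 4) for t in types})
```

The same pattern applies to the other count tables (risk level and star rating) by swapping in their counts.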

Market cap by Risk:
Marginal probabilities table:

Joint and Marginal Probabilities P(Market Cap, Risk)
                    Risk Level
Market Cap        Low   Average     High   P(Market Cap)
Small          0.0392    0.0582   0.1069          0.2043
Mid-Cap        0.0609    0.0961   0.0609          0.2179
Large          0.1867    0.2598   0.1313          0.5778
P(Risk)        0.2869    0.4141   0.2991          1.0000

Conditional probabilities:

Conditional Probabilities P(Risk|Market Cap)
                    Risk Level
Market Cap        Low   Average     High
Small          0.1921    0.2848   0.5232
Mid-Cap        0.2795    0.4410   0.2795
Large          0.3232    0.4496   0.2272

Conditional probabilities:

Conditional Probabilities P(Market Cap|Risk)
                    Risk Level
Market Cap        Low   Average     High
Small          0.1368    0.1405   0.3575
Mid-Cap        0.2123    0.2320   0.2036
Large          0.6509    0.6275   0.4389

Market cap by Rating:
Marginal probabilities table:

Joint and Marginal Probabilities P(Market Cap, Rating)
                    Rating
Market Cap        One      Two    Three     Four     Five   P(Market Cap)
Small          0.0095   0.0433   0.0934   0.0365   0.0217          0.2043
Mid-Cap        0.0162   0.0582   0.0758   0.0555   0.0122          0.2179
Large          0.0311   0.1516   0.2246   0.1164   0.0541          0.5778
P(Rating)      0.0568   0.2530   0.3938   0.2084   0.0880          1.0000

Conditional probabilities:

Conditional Probabilities P(Rating|Market Cap)
                    Rating
Market Cap        One      Two    Three     Four     Five
Small          0.0464   0.2119   0.4570   0.1788   0.1060
Mid-Cap        0.0745   0.2671   0.3478   0.2547   0.0559
Large          0.0539   0.2623   0.3888   0.2014   0.0937

Conditional probabilities:

Conditional Probabilities P(Market Cap|Rating)
                    Rating
Market Cap        One      Two    Three     Four     Five
Small          0.1667   0.1711   0.2371   0.1753   0.2462
Mid-Cap        0.2857   0.2299   0.1924   0.2662   0.1385
Large          0.5476   0.5989   0.5704   0.5584   0.6154

Type by Risk:
Marginal probabilities table:

Joint and Marginal Probabilities P(Type, Risk)
                    Risk Level
Type              Low   Average     High   P(Type)
Growth         0.1610    0.2341   0.1637    0.5589
Value          0.1258    0.1800   0.1353    0.4411
P(Risk)        0.2869    0.4141   0.2991    1.0000

Conditional probabilities:

Conditional Probabilities P(Risk|Type)
                    Risk Level
Type              Low   Average     High
Growth         0.2881    0.4189   0.2930
Value          0.2853    0.4080   0.3067

Conditional probabilities:

Conditional Probabilities P(Type|Risk)
                    Risk Level
Type              Low   Average     High
Growth         0.5613    0.5654   0.5475
Value          0.4387    0.4346   0.4525

Type by Rating:
Marginal probabilities table:

Joint and Marginal Probabilities P(Type, Rating)
                    Rating
Type              One      Two    Three     Four     Five   P(Type)
Growth         0.0311   0.1394   0.2260   0.1177   0.0447    0.5589
Value          0.0257   0.1137   0.1678   0.0907   0.0433    0.4411
P(Rating)      0.0568   0.2530   0.3938   0.2084   0.0880    1.0000

Conditional probabilities:

Conditional Probabilities P(Rating|Type)
                    Rating
Type              One      Two    Three     Four     Five
Growth         0.0557   0.2494   0.4044   0.2107   0.0799
Value          0.0583   0.2577   0.3804   0.2055   0.0982

Conditional probabilities:

Conditional Probabilities P(Type|Rating)
                    Rating
Type              One      Two    Three     Four     Five
Growth         0.5476   0.5508   0.5739   0.5649   0.5077
Value          0.4524   0.4492   0.4261   0.4351   0.4923

Risk by Rating:
Marginal probabilities table:

Joint and Marginal Probabilities P(Risk, Rating)
                    Rating
Risk Level        One      Two    Three     Four     Five   P(Risk)
Low            0.0149   0.0622   0.1245   0.0582   0.0271    0.2869
Average        0.0217   0.0934   0.1705   0.0907   0.0379    0.4141
High           0.0203   0.0974   0.0988   0.0595   0.0230    0.2991
P(Rating)      0.0568   0.2530   0.3938   0.2084   0.0880    1.0000

Conditional probabilities:

Conditional Probabilities P(Rating|Risk)
                    Rating
Risk Level        One      Two    Three     Four     Five
Low            0.0519   0.2170   0.4340   0.2028   0.0943
Average        0.0523   0.2255   0.4118   0.2190   0.0915
High           0.0679   0.3258   0.3303   0.1991   0.0769

Conditional probabilities:

Conditional Probabilities P(Risk|Rating)
                    Rating
Risk Level        One      Two    Three     Four     Five
Low            0.2619   0.2460   0.3162   0.2792   0.3077
Average        0.3810   0.3690   0.4330   0.4351   0.4308
High           0.3571   0.3850   0.2509   0.2857   0.2615


Chapter 6

3 Yr, 5 Yr, 10 Yr


According to the boxplots and normal probability plots, the 3-year and 10-year return % are quite normally distributed while the 5-year return % is slightly right-skewed.

3-Yr Return% by Type



According to the boxplots and normal probability plots, the 3-year return % for both growth and value funds is slightly right-skewed.

3-year Return % by Market cap:

According to the boxplots and normal probability plots, the 3-year return % for small-cap funds is right-skewed, while the 3-year return % for mid-cap and large-cap funds is approximately normal.

5-Yr Return% by Type


According to the boxplots and normal probability plots, the 5-year return % for growth funds is right-skewed, while that for value funds is approximately normal.

5-year Return % by Market cap:

According to the boxplots and normal probability plots, the 5-year return % for small-cap, mid-cap, and large-cap funds is right-skewed.



10-Yr Return% by Type



According to the boxplots and normal probability plots, the 10-year return % for growth funds is left-skewed, while that of the value funds is roughly normally distributed.



10-year Return % by Market cap:


According to the boxplots and normal probability plots, the 10-year return % is approximately normally distributed for small-cap funds but left-skewed for mid-cap and large-cap funds.


Chapter 8

95% confidence interval

3-year return %
Growth: $16.27 \le \mu \le 17.25$
Value: $12.26 \le \mu \le 12.88$
Since the 95% confidence intervals do not overlap, the mean 3-year return % of the growth funds is significantly different from that of the value funds at the 95% level of confidence.

5-year return %
Growth: $15.71 \le \mu \le 16.48$
Value: $9.22 \le \mu \le 9.70$
Since the 95% confidence intervals do not overlap, the mean 5-year return % of the growth funds is significantly different from that of the value funds at the 95% level of confidence.

10-year return %
Growth: $13.54 \le \mu \le 13.97$
Value: $10.73 \le \mu \le 11.05$
Since the 95% confidence intervals do not overlap, the mean 10-year return % of the growth funds is significantly different from that of the value funds at the 95% level of confidence.

3-year return %
Small: $12.58 \le \mu \le 13.99$
Mid-Cap: $13.52 \le \mu \le 14.66$
Large: $15.32 \le \mu \le 16.27$
Since the 95% confidence interval for large-cap funds lies entirely above the upper limits of the mid-cap and small-cap intervals, the mean 3-year return % of the large-cap funds is significantly higher than that of the mid-cap and small-cap funds at the 95% level of confidence. Since the 95% confidence intervals for mid-cap and small-cap funds overlap, the mean 3-year return % of the mid-cap and small-cap funds are not significantly different from each other at the 95% level of confidence.

5-year return %
Small: $11.22 \le \mu \le 12.66$
Mid-Cap: $11.84 \le \mu \le 13.12$
Large: $13.40 \le \mu \le 14.32$
Since the 95% confidence interval for large-cap funds lies entirely above the upper limits of the mid-cap and small-cap intervals, the mean 5-year return % of the large-cap funds is significantly higher than that of the mid-cap and small-cap funds at the 95% level of confidence. Since the 95% confidence intervals for mid-cap and small-cap funds overlap, the mean 5-year return % of the mid-cap and small-cap funds are not significantly different from each other at the 95% level of confidence.


10-year return %
Small: $11.23 \le \mu \le 11.95$
Mid-Cap: $11.74 \le \mu \le 12.26$
Large: $12.75 \le \mu \le 13.24$
Since the 95% confidence interval for large-cap funds lies entirely above the upper limits of the mid-cap and small-cap intervals, the mean 10-year return % of the large-cap funds is significantly higher than that of the mid-cap and small-cap funds at the 95% level of confidence. Since the 95% confidence intervals for mid-cap and small-cap funds overlap, the mean 10-year return % of the mid-cap and small-cap funds are not significantly different from each other at the 95% level of confidence.
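The comparisons in this chapter rest on checking whether pairs of confidence intervals overlap. A minimal sketch of that check, using the 10-year intervals by market cap quoted above (the helper `intervals_overlap` is an illustrative assumption, not part of the original solution):

```python
def intervals_overlap(a, b):
    """Return True when closed intervals a = (lo, hi) and b = (lo, hi) overlap."""
    return a[0] <= b[1] and b[0] <= a[1]


# 95% confidence intervals for the mean 10-year return %, by market cap.
ten_year = {
    "Small":   (11.23, 11.95),
    "Mid-Cap": (11.74, 12.26),
    "Large":   (12.75, 13.24),
}

pairs = [("Small", "Mid-Cap"), ("Small", "Large"), ("Mid-Cap", "Large")]
overlap = {pair: intervals_overlap(ten_year[pair[0]], ten_year[pair[1]]) for pair in pairs}

for (first, second), flag in overlap.items():
    status = "overlap" if flag else "do not overlap"
    print(f"{first} vs {second}: intervals {status}")
```

Non-overlapping intervals support the conclusion of a difference; overlapping intervals are a conservative screen, and a formal two-sample test (as in Chapter 10) is the more direct way to compare two means.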



Chapter 10

Year-to-date Return %: Population 1 = growth (413), 2 = value (326)
$H_0\colon \sigma_1^2 = \sigma_2^2$ The population variances are the same.
$H_1\colon \sigma_1^2 \ne \sigma_2^2$ The population variances are different.

PHStat output:

F Test for Differences in Two Variances

Data
Level of Significance                           0.05
Larger-Variance Sample
  Sample Size                                    413
  Sample Variance                         21.8834059
Smaller-Variance Sample
  Sample Size                                    326
  Sample Variance                        9.904962372

Intermediate Calculations
F Test Statistic                              2.2093
Population 1 Sample Degrees of Freedom           412
Population 2 Sample Degrees of Freedom           325

Two-Tail Test
Upper Critical Value                          1.2303
p-Value                                       0.0000
Reject the null hypothesis

Decision rule: If FSTAT > 1.2303, reject H0.
Test statistic: $F_{STAT} = S_1^2 / S_2^2 = 2.2093$
Decision: Since FSTAT = 2.2093 > 1.2303 and the p-value = 0.0000 < 0.05, reject H0. There is enough evidence to conclude that the two population variances are different. Hence, the appropriate test for the difference in two means is the separate-variance t test.



Year-to-date Return %: Population 1 = growth (413), 2 = value (326)
$H_0\colon \mu_1 = \mu_2$
$H_1\colon \mu_1 \ne \mu_2$

PHStat output:

Separate-Variances t Test for the Difference Between Two Means
(assumes unequal population variances)

Data
Hypothesized Difference                            0
Level of Significance                           0.05
Population 1 Sample
  Sample Size                                    413
  Sample Mean                           -11.98745763
  Sample Standard Deviation                   4.6780
Population 2 Sample
  Sample Size                                    326
  Sample Mean                            -0.48309816
  Sample Standard Deviation                   3.1472

Intermediate Calculations
Numerator of Degrees of Freedom               0.0070
Denominator of Degrees of Freedom             0.0000
Total Degrees of Freedom                    719.8936
Degrees of Freedom                               719
Standard Error                                0.2887
Difference in Sample Means                  -11.5044
Separate-Variance t Test Statistic          -39.8436

Two-Tail Test
Lower Critical Value                         -1.9633
Upper Critical Value                          1.9633
p-Value                                       0.0000
Reject the null hypothesis

Decision: Since tSTAT = -39.8436 < -1.9633 and the p-value = 0.0000 < 0.05, reject H0. There is sufficient evidence to conclude that the mean year-to-date return percentage is different between growth and value funds.



5-year Return %: Population 1 = growth (413), 2 = value (326)
$H_0\colon \sigma_1^2 = \sigma_2^2$ The population variances are the same.
$H_1\colon \sigma_1^2 \ne \sigma_2^2$ The population variances are different.

PHStat output:

F Test for Differences in Two Variances

Data
Level of Significance                           0.05
Larger-Variance Sample
  Sample Size                                    413
  Sample Variance                         16.0104397
Smaller-Variance Sample
  Sample Size                                    326
  Sample Variance                        4.806191672

Intermediate Calculations
F Test Statistic                              3.3312
Population 1 Sample Degrees of Freedom           412
Population 2 Sample Degrees of Freedom           325

Two-Tail Test
Upper Critical Value                          1.2303
p-Value                                       0.0000
Reject the null hypothesis

Decision rule: If FSTAT > 1.2303, reject H0.
Test statistic: $F_{STAT} = S_1^2 / S_2^2 = 3.3312$
Decision: Since FSTAT = 3.3312 > 1.2303 and the p-value = 0.0000 < 0.05, reject H0. There is enough evidence to conclude that the two population variances are different. Hence, the appropriate test for the difference in two means is the separate-variance t test.



5-year Return %: Population 1 = growth (413), 2 = value (326)
$H_0\colon \mu_1 = \mu_2$
$H_1\colon \mu_1 \ne \mu_2$

PHStat output:

Separate-Variances t Test for the Difference Between Two Means
(assumes unequal population variances)

Data
Hypothesized Difference                            0
Level of Significance                           0.05
Population 1 Sample
  Sample Size                                    413
  Sample Mean                            16.09435835
  Sample Standard Deviation                   4.0013
Population 2 Sample
  Sample Size                                    326
  Sample Mean                             9.45696319
  Sample Standard Deviation                   2.1923

Intermediate Calculations
Numerator of Degrees of Freedom               0.0029
Denominator of Degrees of Freedom             0.0000
Total Degrees of Freedom                    663.3369
Degrees of Freedom                               663
Standard Error                                0.2313
Difference in Sample Means                    6.6374
Separate-Variance t Test Statistic           28.6935

Two-Tail Test
Lower Critical Value                         -1.9635
Upper Critical Value                          1.9635
p-Value                                       0.0000
Reject the null hypothesis

Decision: Since tSTAT = 28.6935 > 1.9635 and the p-value = 0.0000 < 0.05, reject H0. There is sufficient evidence to conclude that the mean five-year return percentage is different between growth and value funds.
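The intermediate calculations in the separate-variance t test can be reproduced from the summary statistics alone. The sketch below assumes the five-year sample means and variances reported in the output and uses the standard separate-variance (Welch) formulas for the standard error, test statistic, and degrees of freedom; the function name `separate_variance_t` is illustrative, and critical values are taken from the output rather than computed.

```python
import math


def separate_variance_t(mean1, var1, n1, mean2, var2, n2):
    """Return (t statistic, degrees of freedom, standard error)."""
    v1 = var1 / n1  # variance of the first sample mean
    v2 = var2 / n2  # variance of the second sample mean
    std_error = math.sqrt(v1 + v2)
    t_stat = (mean1 - mean2) / std_error
    # Degrees of freedom: numerator over denominator, before truncation.
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t_stat, df, std_error


# Five-year return %: population 1 = growth, population 2 = value.
t_stat, df, std_error = separate_variance_t(
    16.09435835, 16.0104397, 413,
    9.45696319, 4.806191672, 326,
)

upper_critical = 1.9635  # two-tail critical value quoted in the output
print(f"standard error = {std_error:.4f}")
print(f"t statistic    = {t_stat:.4f}")
print(f"df (truncated) = {int(df)}")
print("reject H0" if abs(t_stat) > upper_critical else "do not reject H0")
```

The degrees of freedom are truncated to an integer, matching the 663 shown in the output.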



10-year Return %: Population 1 = growth (413), 2 = value (326)
$H_0\colon \sigma_1^2 = \sigma_2^2$ The population variances are the same.
$H_1\colon \sigma_1^2 \ne \sigma_2^2$ The population variances are different.

PHStat output:

F Test for Differences in Two Variances

Data
Level of Significance                           0.05
Larger-Variance Sample
  Sample Size                                    413
  Sample Variance                        5.026899222
Smaller-Variance Sample
  Sample Size                                    326
  Sample Variance                        2.172481672

Intermediate Calculations
F Test Statistic                              2.3139
Population 1 Sample Degrees of Freedom           412
Population 2 Sample Degrees of Freedom           325

Two-Tail Test
Upper Critical Value                          1.2303
p-Value                                       0.0000
Reject the null hypothesis

Decision rule: If FSTAT > 1.2303, reject H0.
Test statistic: $F_{STAT} = S_1^2 / S_2^2 = 2.3139$
Decision: Since FSTAT = 2.3139 > 1.2303 and the p-value = 0.0000 < 0.05, reject H0. There is enough evidence to conclude that the two population variances are different. Hence, the appropriate test for the difference in two means is the separate-variance t test.
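The F test statistic is the ratio of the larger sample variance to the smaller one, compared against the upper critical value. A minimal sketch using the ten-year sample variances from the output; the critical value 1.2303 is copied from the output, not computed here, and the variable names are illustrative.

```python
# Sample variances for the 10-year return % (growth has the larger variance).
larger_variance = 5.026899222   # growth, n = 413
smaller_variance = 2.172481672  # value,  n = 326

# F test statistic: larger sample variance over smaller sample variance.
f_stat = larger_variance / smaller_variance

df_numerator = 413 - 1
df_denominator = 326 - 1
upper_critical = 1.2303  # quoted from the PHStat output

reject_h0 = f_stat > upper_critical
print(f"FSTAT = {f_stat:.4f} with df = ({df_numerator}, {df_denominator})")
print("reject H0" if reject_h0 else "do not reject H0")
```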



10-year Return %: Population 1 = growth (413), 2 = value (326)
$H_0\colon \mu_1 = \mu_2$
$H_1\colon \mu_1 \ne \mu_2$

PHStat output:

Separate-Variances t Test for the Difference Between Two Means
(assumes unequal population variances)

Data
Hypothesized Difference                            0
Level of Significance                           0.05
Population 1 Sample
  Sample Size                                    413
  Sample Mean                            13.75256659
  Sample Standard Deviation                   2.2421
Population 2 Sample
  Sample Size                                    326
  Sample Mean                            10.89196319
  Sample Standard Deviation                   1.4739

Intermediate Calculations
Numerator of Degrees of Freedom               0.0004
Denominator of Degrees of Freedom             0.0000
Total Degrees of Freedom                    714.9580
Degrees of Freedom                               714
Standard Error                                0.1372
Difference in Sample Means                    2.8606
Separate-Variance t Test Statistic           20.8433

Two-Tail Test
Lower Critical Value                         -1.9633
Upper Critical Value                          1.9633
p-Value                                       0.0000
Reject the null hypothesis

Decision: Since tSTAT = 20.8433 > 1.9633 and the p-value = 0.0000 < 0.05, reject H0. There is sufficient evidence to conclude that the mean ten-year return percentage is different between growth and value funds.



Chapter 11

Year-to-date return percentages, based on the fund market caps
$H_0\colon \sigma_1^2 = \sigma_2^2 = \sigma_3^2$
$H_1$: At least one variance is different.

Levene's test for equality of variances:

SUMMARY
Groups      Count       Sum       Average   Variance
Large         427   2489.61   5.830468384    13.1594
Mid-Cap       161    969.67   6.022795031    14.6381
Small         151    866.84   5.740662252    15.5679

ANOVA
Source of Variation           SS    df        MS        F   P-value   F crit
Between Groups            6.7633     2    3.3816   0.2420    0.7851   3.0080
Within Groups         10283.1965   736   13.9717
Total                 10289.9598   738

Level of significance   0.05

Since p-value = 0.7851 > 0.05, do not reject H0. There is not enough evidence that the equal-variance assumption needed for the one-way ANOVA F test for the difference in the means is violated. Hence, the one-way ANOVA F test for the difference in the means can be used.

$H_0\colon \mu_1 = \mu_2 = \mu_3$
$H_1$: At least one mean is different.
Decision rule: df: 2, 736. If F > 3.0080, reject H0.

One-way ANOVA F test for difference in the means:

ANOVA: Single Factor
SUMMARY
Groups      Count        Sum        Average   Variance
Large         427   -2523.04   -5.908758782    47.2274
Mid-Cap       161    -1323.5   -8.220496894    50.4465
Small         151   -1261.77   -8.356092715    47.5818

ANOVA
Source of Variation           SS    df         MS         F   P-value   F crit
Between Groups         1020.3267     2   510.1634   10.6285    0.0000   3.0080
Within Groups         35327.5746   736    47.9994
Total                 36347.9013   738

Level of significance   0.05

Test statistic: FSTAT = 10.6285
Since p-value = 0.0000 < 0.05 and FSTAT = 10.6285 > 3.0080, reject H0. There is enough evidence to conclude that there is a significant difference in mean year-to-date return percentages across the funds (Small, Mid-Cap, Large).


From the Tukey pairwise comparison procedure, there is a difference in mean year-to-date return percentages between Large and Mid-Cap funds and between Large and Small funds. There is no difference between Mid-Cap and Small funds.



Five-year return percentages, based on the fund market caps
$H_0\colon \sigma_1^2 = \sigma_2^2 = \sigma_3^2$
$H_1$: At least one variance is different.

Levene's test for equality of variances:

SUMMARY
Groups      Count       Sum       Average   Variance
Large         427    1730.4   4.052459016     7.0773
Mid-Cap       161    534.01   3.316832298     5.8330
Small         151    542.29   3.591324503     7.5656

ANOVA
Source of Variation           SS    df        MS        F   P-value   F crit
Between Groups           71.3730     2   35.6865   5.1672    0.0059   3.0080
Within Groups          5083.0683   736    6.9063
Total                  5154.4414   738

Level of significance   0.05

Since p-value = 0.0059 < 0.05, reject H0. There is enough evidence to conclude that the variances in five-year return percentages across the funds (Small, Mid-Cap, Large) are different.

$H_0\colon \mu_1 = \mu_2 = \mu_3$
$H_1$: At least one mean is different.
Decision rule: df: 2, 736. If F > 3.0080, reject H0.

One-way ANOVA F test for difference in the means:

ANOVA: Single Factor
SUMMARY
Groups      Count       Sum       Average   Variance
Large         427   5918.18   13.85990632    23.3936
Mid-Cap       161   2008.76   12.47677019    16.8730
Small         151      1803   11.94039735    20.0694

ANOVA
Source of Variation           SS    df         MS         F   P-value   F crit
Between Groups          508.9014     2   254.4507   11.9468    0.0000   3.0080
Within Groups         15675.7707   736    21.2986
Total                 16184.6721   738

Level of significance   0.05

Test statistic: FSTAT = 11.9468
Since p-value = 0.0000 < 0.05 and FSTAT = 11.9468 > 3.0080, reject H0. There is enough evidence to conclude that there is a significant difference in mean five-year return percentages across the funds (Small, Mid-Cap, Large).
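The one-way ANOVA table can be rebuilt from the group counts, means, and variances in the SUMMARY block. The sketch below assumes the five-year figures shown there; because the variances are rounded to four decimals, the sums of squares agree with the printed table only to within rounding. Variable names are illustrative.

```python
# (count, mean, variance) for the five-year return %, by market cap.
groups = {
    "Large":   (427, 13.85990632, 23.3936),
    "Mid-Cap": (161, 12.47677019, 16.8730),
    "Small":   (151, 11.94039735, 20.0694),
}

n_total = sum(n for n, _, _ in groups.values())
grand_mean = sum(n * mean for n, mean, _ in groups.values()) / n_total

# Between-group and within-group sums of squares.
ss_between = sum(n * (mean - grand_mean) ** 2 for n, mean, _ in groups.values())
ss_within = sum((n - 1) * var for n, _, var in groups.values())

df_between = len(groups) - 1
df_within = n_total - len(groups)

# Mean squares and the F test statistic.
ms_between = ss_between / df_between
ms_within = ss_within / df_within
f_stat = ms_between / ms_within

print(f"SSB = {ss_between:.4f}, SSW = {ss_within:.4f}")
print(f"FSTAT = {f_stat:.4f} with df = ({df_between}, {df_within})")
```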


From the Tukey pairwise comparison procedure, there is a difference in mean five-year return percentages between Large and Mid-Cap funds and between Large and Small funds. There is no difference between Mid-Cap and Small funds.



Ten-year return percentages, based on the fund market caps
$H_0\colon \sigma_1^2 = \sigma_2^2 = \sigma_3^2$
$H_1$: At least one variance is different.

Levene's test for equality of variances:

SUMMARY
Groups      Count       Sum       Average   Variance
Large         427    882.24   2.066135831     2.3130
Mid-Cap       161    216.92   1.347329193     0.9866
Small         151    264.28   1.750198675     1.9114

ANOVA
Source of Variation           SS    df        MS         F   P-value   F crit
Between Groups           62.1137     2   31.0569   15.9857    0.0000   3.0080
Within Groups          1429.8890   736    1.9428
Total                  1492.0027   738

Level of significance   0.05

Since p-value = 0.0000 < 0.05, reject H0. There is enough evidence to conclude that the variances in ten-year return percentages across the funds (Small, Mid-Cap, Large) are different.

$H_0\colon \mu_1 = \mu_2 = \mu_3$
$H_1$: At least one mean is different.
Decision rule: df: 2, 736. If F > 3.0080, reject H0.

One-way ANOVA F test for difference in the means:

ANOVA: Single Factor
SUMMARY
Groups      Count       Sum       Average   Variance
Large         427   5549.14   12.99564403     6.5676
Mid-Cap       161   1931.98   11.99987578     2.8116
Small         151   1749.47   11.58589404     4.9938

ANOVA
Source of Variation           SS    df         MS         F   P-value   F crit
Between Groups          271.2775     2   135.6388   24.9780    0.0000   3.0080
Within Groups          3996.7271   736     5.4303
Total                  4268.0047   738

Level of significance   0.05

Test statistic: FSTAT = 24.9780
Since p-value = 0.0000 < 0.05 and FSTAT = 24.9780 > 3.0080, reject H0. There is enough evidence to conclude that there is a significant difference in mean ten-year return percentages across the funds (Small, Mid-Cap, Large).


From the Tukey pairwise comparison procedure, there is a difference in mean ten-year return percentages between Large and Mid-Cap funds and between Large and Small funds.


Chapter 12

1.

Year-to-date Return%: Population 1 = Blend, 2 = Growth, 3 = Value
$H_0\colon M_1 = M_2 = M_3$
$H_1$: Not all $M_j$ are the same.

Kruskal-Wallis Rank Test for Differences in Medians

Data
Level of Significance                    0.05

Intermediate Calculations
Sum of Squared Ranks/Sample Size     4.19E+08
Sum of Sample Sizes                      1107
Number of Groups                            3

Group     Sample Size   Sum of Ranks   Mean Ranks
Blend             368       222092.5   603.512228
Growth            413        99223.5   240.250605
Value             326         291962   895.588957

Test Result
H Test Statistic                     778.7265
Critical Value                         5.9915
p-Value                                0.0000
Reject the null hypothesis

Because H = 778.7265 > 5.9915 or p-value = 0.0000, reject H0. At the 0.05 significance level, there is evidence of a difference in median year-to-date returns among the fund types (blend, growth, value).
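The H test statistic can be recomputed from the group sizes and rank sums in the output using $H = \frac{12}{n(n+1)} \sum_j \frac{T_j^2}{n_j} - 3(n+1)$, where $T_j$ is the sum of ranks in group $j$ and $n$ is the total sample size. A minimal sketch, assuming the year-to-date figures above and no correction for ties (variable names are illustrative):

```python
# (sample size, sum of ranks) for the year-to-date return %, by fund type.
rank_sums = {
    "Blend":  (368, 222092.5),
    "Growth": (413, 99223.5),
    "Value":  (326, 291962.0),
}

n = sum(size for size, _ in rank_sums.values())

# Sum over groups of (squared rank sum / sample size).
sum_sq_ranks = sum(t ** 2 / size for size, t in rank_sums.values())

# Kruskal-Wallis H statistic (no tie correction applied).
h_stat = 12.0 / (n * (n + 1)) * sum_sq_ranks - 3 * (n + 1)

critical_value = 5.9915  # chi-square critical value quoted in the output (2 df)
print(f"sum of squared ranks / sample size = {sum_sq_ranks:.3e}")
print(f"H = {h_stat:.4f}")
print("reject H0" if h_stat > critical_value else "do not reject H0")
```

As a sanity check, the rank sums should add up to n(n + 1)/2, the sum of the ranks 1 through n.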


Five-year Return%: Population 1 = Blend, 2 = Growth, 3 = Value
$H_0\colon M_1 = M_2 = M_3$
$H_1$: Not all $M_j$ are the same.

Kruskal-Wallis Rank Test for Differences in Medians

Data
Level of Significance                    0.05

Intermediate Calculations
Sum of Squared Ranks/Sample Size     3.93E+08
Sum of Sample Sizes                      1107
Number of Groups                            3

Group     Sample Size   Sum of Ranks   Mean Ranks
Blend             368       176964.5   480.881793
Growth            413         339124   821.123487
Value             326        97189.5   298.127301

Test Result
H Test Statistic                     516.3777
Critical Value                         5.9915
p-Value                                0.0000

Reject the null hypothesis

Because H = 516.3777 > 5.9915 or p-value = 0.0000, reject H0. At the 0.05 significance level, there is evidence of a difference in median five-year returns among the fund types (blend, growth, value).

Ten-year Return%: Population 1 = Blend, 2 = Growth, 3 = Value
$H_0\colon M_1 = M_2 = M_3$
$H_1$: Not all $M_j$ are the same.

Kruskal-Wallis Rank Test for Differences in Medians

Data
Level of Significance                    0.05

Intermediate Calculations
Sum of Squared Ranks/Sample Size     3.75E+08
Sum of Sample Sizes                      1107
Number of Groups                            3

Group     Sample Size   Sum of Ranks   Mean Ranks
Blend             368       188260.5   511.577446
Growth            413         316103   765.382567
Value             326       108914.5   334.093558

Test Result
H Test Statistic                     341.2596
Critical Value                         5.9915
p-Value                                0.0000
Reject the null hypothesis

Because H = 341.2596 > 5.9915 or p-value = 0.0000, reject H0. At the 0.05 significance level, there is evidence of a difference in median ten-year returns among the fund types (blend, growth, value).



2.

Year-to-date Return%: Population 1 = Large, 2 = Mid-Cap, 3 = Small
$H_0\colon M_1 = M_2 = M_3$
$H_1$: Not all $M_j$ are the same.

Kruskal-Wallis Rank Test for Differences in Medians

Data
Level of Significance                    0.05

Intermediate Calculations
Sum of Squared Ranks/Sample Size     3.42E+08
Sum of Sample Sizes                      1107
Number of Groups                            3

Group     Sample Size   Sum of Ranks   Mean Ranks
Large             621         366578   590.302738
Mid-Cap           234       122198.5   522.215812
Small             252       124501.5   494.053571

Test Result
H Test Statistic                      19.1794
Critical Value                         5.9915
p-Value                                0.0001
Reject the null hypothesis

Because H = 19.1794 > 5.9915 or p-value = 0.0001, reject H0. At the 0.05 significance level, there is evidence of a difference in median year-to-date returns among the market caps (small, mid-cap, large).

2. cont.
Five-year Return %: Population 1 = Large, 2 = Mid-Cap, 3 = Small
H0: M1 = M2 = M3
H1: Not all Mj are the same.

Kruskal-Wallis Rank Test for Differences in Medians

Data
Level of Significance                 0.05

Intermediate Calculations
Sum of Squared Ranks/Sample Size      3.52E+08
Sum of Sample Sizes                   1107
Number of Groups                      3

Group     Sample Size   Sum of Ranks   Mean Ranks
Large     621           400166         644.389694
Mid-Cap   234           113809.5       486.365385
Small     252           99302.5        394.05754

Test Result
H Test Statistic                      123.1813
Critical Value                        5.9915
p-Value                               0.0000

Reject the null hypothesis

Because H = 123.1813 > 5.9915 or p-value = 0.0000, reject H0. At the 0.05 significance level, there is evidence of a difference in median five-year returns among the market caps (small, mid-cap, large).

2. cont.
Ten-year Return %: Population 1 = Large, 2 = Mid-Cap, 3 = Small
H0: M1 = M2 = M3
H1: Not all Mj are the same.

Kruskal-Wallis Rank Test for Differences in Medians

Data
Level of Significance                 0.05

Intermediate Calculations
Sum of Squared Ranks/Sample Size      3.54E+08
Sum of Sample Sizes                   1107
Number of Groups                      3

Group     Sample Size   Sum of Ranks   Mean Ranks
Large     621           404495.5       651.361514
Mid-Cap   234           109863         469.5
Small     252           98919.5        392.537698

Test Result
H Test Statistic                      138.2124
Critical Value                        5.9915
p-Value                               0.0000

Reject the null hypothesis

Because H = 138.2124 > 5.9915 or p-value = 0.0000, reject H0. At the 0.05 significance level, there is evidence of a difference in median ten-year returns among the market caps (small, mid-cap, large).

3. Risk based on market cap:
H0: There is no relationship between risk and market cap
H1: There is a relationship between risk and market cap

PHStat output of the chi-square test:

Chi-Square Test

Observed Frequencies
                    Risk Level
Market Cap    Low        Average    High       Total
Mid-Cap       60         111        63         234
Small         53         79         120        252
Large         195        286        140        621
Total         308        476        323        1107

Expected Frequencies
                    Risk Level
Market Cap    Low        Average    High       Total
Mid-Cap       65.10569   100.6179   68.27642   234
Small         70.11382   108.3577   73.52846   252
Large         172.7805   267.0244   181.1951   621
Total         308        476        323        1107

Data
Level of Significance        0.05
Number of Rows               3
Number of Columns            3
Degrees of Freedom           4

Results
Critical Value               9.487729
Chi-Square Test Statistic    56.95336
p-Value                      1.27E-11
Reject the null hypothesis

Expected frequency assumption is met.

Since p-value = 0.0000 < 0.05, reject H0. There is enough evidence that risk is related to market cap and, hence, evidence of a difference in risk based on market cap.
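The expected frequencies and the test statistic in this output follow from the observed table alone: each expected count is (row total × column total) / n, and the statistic sums (observed − expected)² / expected over all cells. The sketch below illustrates that arithmetic and is not PHStat's implementation. The function names are hypothetical, and the p-value helper relies on a closed form that is valid only when the degrees of freedom are even (4 and 8 in the tests of this problem).

```python
import math

def chi_square_independence(observed):
    """Return (statistic, degrees of freedom, expected table)."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    # Expected count under independence: row total * column total / n.
    expected = [[r * c / n for c in col_totals] for r in row_totals]
    stat = sum(
        (o - e) ** 2 / e
        for o_row, e_row in zip(observed, expected)
        for o, e in zip(o_row, e_row)
    )
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return stat, df, expected

def chi_square_p_value_even_df(stat, df):
    """Upper-tail chi-square probability; closed form for even df only."""
    if df <= 0 or df % 2:
        raise ValueError("closed form requires a positive, even df")
    half = stat / 2.0
    return math.exp(-half) * sum(
        half ** k / math.factorial(k) for k in range(df // 2)
    )

# Risk level (Low, Average, High) by market cap (Mid-Cap, Small, Large).
observed = [[60, 111, 63], [53, 79, 120], [195, 286, 140]]
stat, df, expected = chi_square_independence(observed)
p_value = chi_square_p_value_even_df(stat, df)
print(f"chi-square = {stat:.4f}, df = {df}, p-value = {p_value:.3g}")
```

For a table with odd degrees of freedom, a statistics library would be needed for the p-value in place of the closed form used here.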

3. cont.
Rating based on market cap:
H0: There is no relationship between rating and market cap
H1: There is a relationship between rating and market cap

PHStat output of the chi-square test:

Chi-Square Test

Observed Frequencies
                              Star Rating
Market Cap    One        Two        Three      Four       Five       Total
Mid-Cap       18         67         80         53         16         234
Small         14         65         101        52         20         252
Large         41         147        243        135        55         621
Total         73         279        424        240        91         1107

Expected Frequencies
                              Star Rating
Market Cap    One        Two        Three      Four       Five       Total
Mid-Cap       15.43089   58.97561   89.62602   50.73171   19.23577   234
Small         16.61789   63.5122    96.52033   54.63415   20.71545   252
Large         40.95122   156.5122   237.8537   134.6341   51.04878   621
Total         73         279        424        240        91         1107

Data
Level of Significance        0.05
Number of Rows               3
Number of Columns            5
Degrees of Freedom           8

Results
Critical Value               15.50731
Chi-Square Test Statistic    5.002362
p-Value                      0.757324
Do not reject the null hypothesis

Expected frequency assumption is met.

Since p-value = 0.7573 > 0.05, do not reject H0. There is not enough evidence that rating is related to market cap and, hence, no evidence of a difference in rating based on market cap.

3. cont.
Risk based on type:
H0: There is no relationship between risk and type
H1: There is a relationship between risk and type

PHStat output of the chi-square test:

Chi-Square Test

Observed Frequencies
                    Risk Level
Type          Low        Average    High       Total
Growth        119        173        121        413
Value         93         133        100        326
Blend         96         170        102        368
Total         308        476        323        1107

Expected Frequencies
                    Risk Level
Type          Low        Average    High       Total
Growth        114.9088   177.5863   120.505    413
Value         90.7028    140.1771   95.12014   326
Blend         102.3884   158.2367   107.3749   368
Total         308        476        323        1107

Data
Level of Significance        0.05
Number of Rows               3
Number of Columns            3
Degrees of Freedom           4

Results
Critical Value               9.487729
Chi-Square Test Statistic    2.484273
p-Value                      0.647454
Do not reject the null hypothesis

Expected frequency assumption is met.

Since p-value = 0.6475 > 0.05, do not reject H0. There is not enough evidence that risk is related to type and, hence, no evidence of a difference in risk based on type.

3. cont.
Rating based on type:
H0: There is no relationship between rating and type
H1: There is a relationship between rating and type

PHStat output of the chi-square test:

Chi-Square Test

Observed Frequencies
                              Star Rating
Type          One        Two        Three      Four       Five       Total
Growth        23         103        167        87         33         413
Value         19         84         124        67         32         326
Blend         31         92         133        86         26         368
Total         73         279        424        240        91         1107

Expected Frequencies
                              Star Rating
Type          One        Two        Three      Four       Five       Total
Growth        27.23487   104.0894   158.1861   89.5393    33.95032   413
Value         21.49774   82.1626    124.8636   70.67751   26.79855   326
Blend         24.26739   92.74797   140.9503   79.7832    30.25113   368
Total         73         279        424        240        91         1107

Data
Level of Significance        0.05
Number of Rows               3
Number of Columns            5
Degrees of Freedom           8

Results
Critical Value               15.50731
Chi-Square Test Statistic    6.201951
p-Value                      0.624622
Do not reject the null hypothesis

Expected frequency assumption is met.

Since p-value = 0.6246 > 0.05, do not reject H0. There is not enough evidence that rating is related to type and, hence, no evidence of a difference in rating based on type.

4. Refer to the conclusions of the various hypothesis tests in parts (1) to (3).

Chapter 15

3-year Return %:
All hypothesis tests are performed at the 5% level of significance.
Let X1 = Assets, X2 = Turnover Ratio, X3 = Expense Ratio, X4 = Type (1 for growth and 0 otherwise), X5 = Risk Level (1 for high and 0 otherwise)

                  Coefficients   Standard Error   t Stat     P-value   R Square   VIF
Intercept         14.7528        0.5127           28.7734    0.0000
Assets            0.0000         0.0000           0.7769     0.4375    0.0605     1.0645
Turnover Ratio    -0.0073        0.0030           -2.4050    0.0164    0.0349     1.0362
Expense Ratio     -1.2833        0.3711           -3.4578    0.0006    0.0663     1.0710
Type              4.2002         0.3077           13.6507    0.0000    0.0057     1.0057
Risk Level        -1.1058        0.3368           -3.2835    0.0011    0.0239     1.0244

A regression analysis revealed that all five variables had VIF values below 5. So there is no reason to suspect collinearity between any pair of variables. A best subsets regression analysis reveals that a four-variable model with turnover ratio, expense ratio, type, and risk level has the lowest Cp value and the highest adjusted r2. The best-subset approach yielded: Cp = 4.6035 with X2X3X4X5 and adjusted r2 = 0.2282.

(PHStat display of 4 smallest Cp)

Model         Cp        k+1   R Square   Adj. R Square   Std. Error
X3X4X5        8.9597    4     0.2257     0.2225          4.1553
X1X3X4X5      9.7842    5     0.2269     0.2227          4.1548
X2X3X4X5      4.6035    5     0.2324     0.2282          4.1402
X1X2X3X4X5    6.0000    6     0.2330     0.2278          4.1413
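The VIF column in the coefficient table is tied to the R Square column beside it: for each independent variable, VIF = 1 / (1 − R²), where R² comes from regressing that variable on the other independent variables. The sketch below recomputes the VIFs from the R Square values printed in the table. The function name is hypothetical, and small differences in the fourth decimal place are expected because the printed R Square values are rounded.

```python
def vif_from_r_square(r_square):
    """Variance inflationary factor for one independent variable."""
    if not 0.0 <= r_square < 1.0:
        raise ValueError("R Square must be in [0, 1)")
    return 1.0 / (1.0 - r_square)

# R Square values as printed in the coefficient table above.
r_squares = {
    "Assets": 0.0605,
    "Turnover Ratio": 0.0349,
    "Expense Ratio": 0.0663,
    "Type": 0.0057,
    "Risk Level": 0.0239,
}
vifs = {name: vif_from_r_square(r2) for name, r2 in r_squares.items()}

for name, vif in vifs.items():
    # The solution uses VIF < 5 as the screen for collinearity.
    print(f"{name:15s} VIF = {vif:.4f}  below 5: {vif < 5}")
```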

Using Stepwise Regression Analysis in PHStat (partial display) reveals the best model

                  Coefficients   Standard Error   t Stat     P-value
Intercept         14.8718        0.4892           30.4008    0.0000
Type              4.2060         0.3075           13.6774    0.0000
Expense Ratio     -1.3371        0.3645           -3.6683    0.0003
Risk Level        -1.1209        0.3361           -3.3348    0.0009
Turnover Ratio    -0.0076        0.0030           -2.5218    0.0119

The most appropriate multiple regression model for predicting three-year return is:
Ŷ = 14.8718 + 0X1 - 0.0076X2 - 1.3371X3 + 4.2060X4 - 1.1209X5,
for X1 = Assets, X2 = Turnover Ratio, X3 = Expense Ratio, X4 = Type (1 for growth and 0 otherwise), X5 = Risk Level (1 for high and 0 otherwise). The adjusted r2 for the best model is 0.2282 and the r2 for the model is 0.2324, so 23.24% of the variation in three-year return can be explained by variation in turnover ratio, expense ratio, type, and risk level. All the remaining independent variables have a p-value for the individual t test statistic < 0.05.
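To show how the fitted three-year model is used, the sketch below evaluates the equation for one fund. The coefficients are those of the stepwise model above; the function name and the example fund (turnover ratio 50, expense ratio 1.0, growth, not high risk) are hypothetical, and the inputs are assumed to be on the same scales as the variables in the data file.

```python
def predict_three_year_return(turnover_ratio, expense_ratio, is_growth, is_high_risk):
    """Predicted 3-year return % from the stepwise model (Assets has coefficient 0)."""
    x4 = 1 if is_growth else 0      # Type dummy: 1 for growth, 0 otherwise
    x5 = 1 if is_high_risk else 0   # Risk Level dummy: 1 for high, 0 otherwise
    return (
        14.8718
        - 0.0076 * turnover_ratio
        - 1.3371 * expense_ratio
        + 4.2060 * x4
        - 1.1209 * x5
    )

# Hypothetical fund, for illustration only.
growth_fund = predict_three_year_return(50, 1.0, is_growth=True, is_high_risk=False)
other_fund = predict_three_year_return(50, 1.0, is_growth=False, is_high_risk=False)
print(f"growth: {growth_fund:.4f}, non-growth: {other_fund:.4f}")
```

Holding the other variables constant, the two predictions differ by the Type coefficient, 4.2060 percentage points.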

3-year Return %: The residual plots: [plots not reproduced]

The residual plots do not reveal any specific pattern.

Normal probability plot: [plot not reproduced]

The normal probability plot indicates that the distribution of the residuals deviates slightly from the normal distribution in the tails. The best model for predicting the 3-year return % is
Ŷ = 14.8718 + 0X1 - 0.0076X2 - 1.3371X3 + 4.2060X4 - 1.1209X5,
for X1 = Assets, X2 = Turnover Ratio, X3 = Expense Ratio, X4 = Type (1 for growth and 0 otherwise), X5 = Risk Level (1 for high and 0 otherwise).

5-year Return %:
All hypothesis tests are performed at the 5% level of significance.
Let X1 = Assets, X2 = Turnover Ratio, X3 = Expense Ratio, X4 = Type (1 for growth and 0 otherwise), X5 = Risk Level (1 for high and 0 otherwise)

                  Coefficients   Standard Error   t Stat     P-value   R Square   VIF
Intercept         11.5498        0.3988           28.9636    0.0000
Assets            0.0000         0.0000           1.2028     0.2295    0.0605     1.0645
Turnover Ratio    -0.0044        0.0024           -1.8650    0.0626    0.0349     1.0362
Expense Ratio     -1.3985        0.2886           -4.8451    0.0000    0.0663     1.0710
Type              6.6679         0.2393           27.8633    0.0000    0.0057     1.0057
Risk Level        -0.9499        0.2619           -3.6265    0.0003    0.0239     1.0244

A regression analysis revealed that all five variables had VIF values below 5. So there is no reason to suspect collinearity between any pair of variables. A best subsets regression analysis reveals that a four-variable model with turnover ratio, expense ratio, type, and risk level has the lowest Cp value; its adjusted r2 (0.5267) is essentially the same as the highest adjusted r2 (0.5269, for the five-variable model). The best-subset approach yielded: Cp = 5.4466 with X2X3X4X5 and adjusted r2 = 0.5267.

(PHStat display of 4 smallest Cp)

Model         Cp        k+1   R Square   Adj. R Square   Std. Error
X3X4X5        7.5684    4     0.5266     0.5246          3.2287
X1X3X4X5      7.4781    5     0.5279     0.5253          3.2263
X2X3X4X5      5.4466    5     0.5292     0.5267          3.2219
X1X2X3X4X5    6.0000    6     0.5302     0.5269          3.2209

Using Stepwise Regression Analysis in PHStat (partial display) reveals the best model

                  Coefficients   Standard Error   t Stat     P-value
Intercept         11.6931        0.3807           30.7158    0.0000
Type              6.6749         0.2393           27.8924    0.0000
Expense Ratio     -1.4633        0.2837           -5.1588    0.0000
Risk Level        -0.9681        0.2616           -3.7010    0.0002
Turnover Ratio    -0.0048        0.0023           -2.0296    0.0428

The most appropriate multiple regression model for predicting five-year return is:
Ŷ = 11.6931 + 0X1 - 0.0048X2 - 1.4633X3 + 6.6749X4 - 0.9681X5,
for X1 = Assets, X2 = Turnover Ratio, X3 = Expense Ratio, X4 = Type (1 for growth and 0 otherwise), X5 = Risk Level (1 for high and 0 otherwise). The adjusted r2 for the best model is 0.5267 and the r2 for the model is 0.5292, so 52.92% of the variation in five-year return can be explained by variation in turnover ratio, expense ratio, type, and risk level. All the remaining independent variables have a p-value for the individual t test statistic < 0.05.

5-year Return %: The residual plots: [plots not reproduced]

The residual plots do not reveal any specific pattern.

Normal probability plot: [plot not reproduced]

The normal probability plot indicates that the distribution of the residuals deviates slightly from the normal distribution in the tails. The best model for predicting the 5-year return % is
Ŷ = 11.6931 + 0X1 - 0.0048X2 - 1.4633X3 + 6.6749X4 - 0.9681X5,
for X1 = Assets, X2 = Turnover Ratio, X3 = Expense Ratio, X4 = Type (1 for growth and 0 otherwise), X5 = Risk Level (1 for high and 0 otherwise).

10-year Return %:
All hypothesis tests are performed at the 5% level of significance.
Let X1 = Assets, X2 = Turnover Ratio, X3 = Expense Ratio, X4 = Type (1 for growth and 0 otherwise), X5 = Risk Level (1 for high and 0 otherwise)

                  Coefficients   Standard Error   t Stat     P-value   R Square   VIF
Intercept         12.6075        0.2239           56.3114    0.0000
Assets            0.0000         0.0000           2.9701     0.0031    0.0605     1.0645
Turnover Ratio    -0.0035        0.0013           -2.6058    0.0094    0.0349     1.0362
Expense Ratio     -1.2513        0.1621           -7.7216    0.0000    0.0663     1.0710
Type              2.8917         0.1344           21.5223    0.0000    0.0057     1.0057
Risk Level        -0.4980        0.1471           -3.3864    0.0007    0.0239     1.0244

A regression analysis revealed that all five variables had VIF values below 5. So there is no reason to suspect collinearity between any pair of variables. A best subsets regression analysis reveals that a five-variable model with assets, turnover ratio, expense ratio, type, and risk level has the lowest Cp value and the highest adjusted r2. The best-subset approach yielded: Cp = 6.0000 with X1X2X3X4X5 and adjusted r2 = 0.4345.

(PHStat display of 4 smallest Cp)

Model         Cp         k+1   R Square   Adj. R Square   Std. Error
X1X2X3X4      15.4679    5     0.4296     0.4265          1.8212
X1X3X4X5      10.7903    5     0.4332     0.4301          1.8155
X2X3X4X5      12.8214    5     0.4316     0.4285          1.8180
X1X2X3X4X5    6.0000     6     0.4384     0.4345          1.8084

Using Stepwise Regression Analysis in PHStat (partial display) reveals the best model

                  Coefficients   Standard Error   t Stat     P-value
Intercept         12.6075        0.2239           56.3114    0.0000
Type              2.8917         0.1344           21.5223    0.0000
Expense Ratio     -1.2513        0.1621           -7.7216    0.0000
Assets            0.0000         0.0000           2.9701     0.0031
Risk Level        -0.4980        0.1471           -3.3864    0.0007
Turnover Ratio    -0.0035        0.0013           -2.6058    0.0094

The most appropriate multiple regression model for predicting ten-year return is:
Ŷ = 12.6075 + 0.0000X1 - 0.0035X2 - 1.2513X3 + 2.8917X4 - 0.4980X5,
for X1 = Assets, X2 = Turnover Ratio, X3 = Expense Ratio, X4 = Type (1 for growth and 0 otherwise), X5 = Risk Level (1 for high and 0 otherwise). The adjusted r2 for the best model is 0.4345 and the r2 for the model is 0.4384, so 43.84% of the variation in ten-year return can be explained by variation in assets, turnover ratio, expense ratio, type, and risk level. All the remaining independent variables have a p-value for the individual t test statistic < 0.05.

10-year Return %: The residual plots: [plots not reproduced]

The residual plots do not reveal any specific pattern.

10-year Return %: Normal probability plot: [plot not reproduced]

The normal probability plot indicates that the distribution of the residuals deviates slightly from the normal distribution in the tails. The best model for predicting the 10-year return % is
Ŷ = 12.6075 + 0.0000X1 - 0.0035X2 - 1.2513X3 + 2.8917X4 - 0.4980X5,
for X1 = Assets, X2 = Turnover Ratio, X3 = Expense Ratio, X4 = Type (1 for growth and 0 otherwise), X5 = Risk Level (1 for high and 0 otherwise).

The Claro Mountain State Student Surveys Case

Chapter 1

1. Question 1: categorical, nominal; Question 2: categorical, nominal; Question 3: categorical, nominal; Question 4: numerical, discrete, interval; Question 5: categorical, nominal; Question 6: categorical, nominal; Question 7: numerical, discrete, interval; Question 8: numerical, discrete, ratio; Question 9: categorical, nominal; Question 10: categorical, nominal; Question 11: categorical, nominal; Question 12: categorical, nominal; Question 13: numerical, discrete, ratio; Question 14: numerical, discrete, ratio; Question 15: numerical, discrete, interval; Question 16: numerical, discrete, ratio; Question 17: categorical, nominal; Question 18: numerical, discrete, interval; Question 19: numerical, discrete, interval; Question 20: numerical, discrete, interval; Question 21: numerical, discrete, interval; Question 22: numerical, discrete, interval; Question 23: categorical, nominal.

2. ZIP or postal code, to be consistent with the data in the file, “the first five characters of such a code with the third, fourth, and fifth characters changed to X.” Annual household income might need more specification, such as thousands of dollars. The free-response gender question might also be considered, even though “free response” is an acceptable definition (although one that may not be amenable to data analysis).

3. Questions in which the domain is listed as a set of choices. Question 1 asks for full-time or part-time and Question 3 asks for transfer status of yes or no. Neither of these responses would need data wrangling.

4. No. This is an open response question. Invite discussion of how gender can be represented. Interested students should be referred to websites such as https://williamsinstitute.law.uclea.edu/quick-facts/survey-measures/ An alternate survey question and response is: Your current gender identity: Man, Woman, Neither, Both, Other.

5. In the data cleaning process, a response of “10” for Questions 18, 19, or 20 could be changed to 7, but recoding the answer as missing is reasonable too.

6. There are many errors. Look at the data for typographical errors, consistency among values for categorical variables, numerical values that are invalid or seem irregular, and non-numerical values for a numeric variable. For example, cell B21 “PT” is an entry error that is inconsistent with the coding of part-time, which should be “P/T.” In cell D11, “N” should be “No,” and in cell D17, “Y” should be “Yes.” Other specific errors may include cells P16, N18, F19, R14, S3, H12, T4, and V9. The error in the Major column is the most subtle.

7. There is a certain amount of ambiguity because none of the columns come with a formal definition. That’s the reason operational definitions are needed. For example, is the postal code 60XXX an error (and should it be something such as 60601), or is it a valid value created to be a less-specific identifier? One would not know unless one had the operational definition of postal code.

8. For instructor’s use: Question 8 invites learners to reflect on Column K. Coding gender identity is an open question that is still being discussed and that raises some non-statistical concerns. There is no one correct way to code gender, but the student survey document uses one model approach described by the Williams Institute at UCLA. This question can be omitted without any loss of comprehension of chapter concepts or later learning. However, for those wanting to include some DEI-related content, this question opens the door to a broad discussion that might include how many categories could/should be defined and whether a category “Other” would be an act of inclusion.
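The cleaning steps described in answers 5 and 6 can be sketched in a few lines. This is an illustration under stated assumptions, not the case's official procedure: the recode maps contain only the three entry errors named above ("PT", "N", "Y"), the 1-to-7 rating scale is inferred from the remark that a "10" could be changed to 7, and all function and variable names are hypothetical.

```python
# Hypothetical recode maps; only these three fixes are named in the answers.
STATUS_RECODES = {"PT": "P/T"}
TRANSFER_RECODES = {"N": "No", "Y": "Yes"}

def recode(value, recodes):
    """Trim whitespace and map a known entry error to its standard code."""
    cleaned = value.strip()
    return recodes.get(cleaned, cleaned)

def clean_rating(value, low=1, high=7, clip=False):
    """Return a valid rating, a clipped rating, or None for 'missing'.

    The 1-7 range is an assumption based on the answer to part 5.
    """
    if low <= value <= high:
        return value
    if clip:
        return max(low, min(high, value))
    return None

print(recode("PT", STATUS_RECODES))    # entry error in cell B21
print(recode("N", TRANSFER_RECODES))   # entry error in cell D11
print(clean_rating(10))                # recoded as missing
print(clean_rating(10, clip=True))     # alternatively changed to 7
```

Whether to clip or to mark a value as missing is the judgment call raised in answer 5; the `clip` flag simply makes that choice explicit.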

Chapter 2

1. [Charts not reproduced] Charts are provided for: Status; Class; Transfer; Expected Year of Graduation; Major; Grad School Intention; Age; Height; Assigned Sex; Gender Identity; Postal Code; Employment; Loans; GPA; Current Credit Hours; Course Materials; Delivery Mode Preference; Food-Dining Satisfaction; Athletic Satisfaction; Student Support Satisfaction; Development Center Visits; Expected Starting Salary; Recommended Course.

2. About half of the students are full-time and half are part-time. Only 13% of the students are first-year, while 64% are upper-level and 23% are second-year students. Most of the students are not transfer students. Nearly 56% of students expect to graduate in 2026 or 2027. Not one major has more than 16% of the responses. More students intend to attend graduate school than fall into either of the other categories. Nearly 75% of the students are between 17 and 21 years old. The majority of students are between 64 and 68 inches tall. There are about the same number of females and males. The majority of the students are currently employed.

Chapter 3

Implicit in this assignment is the determination of which variables are numerical variables. In turn, that creates an opportunity to discuss an issue about summary statistics. The data set contains a set of ordinal-scaled attitudinal variables that measure satisfaction. In a strict sense, these are categorical variables with numeric values and, therefore, would not be part of the report. That said, some sources treat such variables as quasi-numerical, and students will have seen many online review sites that report means of ordinal-scaled ratings, such as 4.3, whether the rating is for a particular item, a hotel, or even a professor’s performance. Using means for such data is questionable, but reporting the mode, the median, and the range would not be. This assignment can be extended by asking students to examine grouped data by calculating the statistics separately for each level of a categorical variable such as status. Explored that way, there are differences in means, which could be further explored if hypothesis testing is taught.

Descriptive Summary for Age, Height, Loans, GPA, Current Credit Hours, Course Materials

                                                             Current        Course
                      Age       Height    Loans     GPA      Credit Hours   Materials
Mean                  21.9304   69.7739   32.8583   3.3171   9.7739         157.9826
Median                22        70        33.3      3.45     12             142
Mode                  21        69        37        3.55     12             289
Minimum               17        64        3.4       1.91     1              31
Maximum               30        76        59.7      4.53     18             291
Range                 13        12        56.3      2.62     17             260
Variance              5.8548    8.1941    75.2518   0.3157   29.2291        5220.7716
Standard Deviation    2.4197    2.8625    8.6748    0.5619   5.4064         72.2549
Coeff. of Variation   11.03%    4.10%     26.40%    16.94%   55.31%         45.74%
Skewness              0.8503    -0.0286   -0.2290   -0.4800  -0.2737        0.3316
Kurtosis              1.1478    -0.6865   0.8657    -0.2074  -1.3071        -0.9278
Count                 115       115       115       115      115            115
Standard Error        0.2256    0.2669    0.8089    0.0524   0.5041         6.7378
First Quartile        20        67        27.3      2.96     5              107
Third Quartile        23        72        38.1      3.67     14             216
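Two rows of the summary table are simple functions of the others: the coefficient of variation is the standard deviation divided by the mean (as a percentage), and the standard error of the mean is the standard deviation divided by the square root of the count. The sketch below recomputes both for Age from the printed values; the function names are hypothetical, and agreement is only to the precision of the rounded table entries.

```python
import math

def coefficient_of_variation(std_dev, mean):
    """Standard deviation as a percentage of the mean."""
    return std_dev / mean * 100.0

def standard_error(std_dev, n):
    """Standard error of the mean for a sample of size n."""
    return std_dev / math.sqrt(n)

# Age column of the descriptive summary: mean, standard deviation, count.
age_mean, age_sd, age_n = 21.9304, 2.4197, 115

age_cv = coefficient_of_variation(age_sd, age_mean)
age_se = standard_error(age_sd, age_n)
print(f"CV = {age_cv:.2f}%, standard error = {age_se:.4f}")
```

The same two functions apply to every numerical column in the table, given that column's mean, standard deviation, and count.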

Descriptive Summary for Food-Dining Satisfaction, Athletic Satisfaction, Student Support Satisfaction, Devel Center Visits, Expected Starting Salary (cont.)

                      Food-Dining    Athletic       Student Support   Devel Ctr   Expected
                      Satisfaction   Satisfaction   Satisfaction      Visits      Starting Salary
Mean                  4.1565         4.5913         4.5478            3.1652      62.4957
Median                4              5              4                 4           63
Mode                  4              4              4                 4           59
Minimum               1              1              1                 0           12
Maximum               7              7              7                 22          98
Range                 6              6              6                 22          86
Variance              2.7121         1.8227         2.2323            7.7005      358.1118
Standard Deviation    1.6469         1.3501         1.4941            2.7750      18.9238
Coeff. of Variation   39.62%         29.41%         32.85%            87.67%      30.28%
Skewness              -0.3270        -0.4802        -0.0058           2.7286      -0.3706
Kurtosis              -0.4448        0.1007         -0.0654           17.2127     -0.3672
Count                 115            115            115               115         115
Standard Error        0.1536         0.1259         0.1393            0.2588      1.7647
First Quartile        3              4              4                 1           50
Third Quartile        5              6              6                 5           77


Chapter 4

This case naturally follows the Chapter 3 case, and the authors recommend that this case be assigned only if the Chapter 3 case was assigned first. There are no implicit concepts in this case, just practice in the computation and presentation of conditional and marginal probabilities.

1. Student status and current class

Count of Class
                      Status
Class          F/T      P/T      Grand Total
First-year     9        6        15
Second-year    8        18       26
Upper-level    41       33       74
Grand Total    58       57       115

Joint and Marginal Probabilities P(Class, Status)
                      Status
Class          F/T      P/T      P(Class)
First-year     0.0783   0.0522   0.1304
Second-year    0.0696   0.1565   0.2261
Upper-level    0.3565   0.2870   0.6435
P(Status)      0.5043   0.4957   1.0000

Conditional Probabilities P(Status|Class)
                      Status
Class          F/T      P/T
First-year     0.6000   0.4000
Second-year    0.3077   0.6923
Upper-level    0.5541   0.4459

Conditional Probabilities P(Class|Status)
                      Status
Class          F/T      P/T
First-year     0.1552   0.1053
Second-year    0.1379   0.3158
Upper-level    0.7069   0.5789
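Each probability table above is derived from the count table in the same way: joint probabilities divide each cell by the grand total, marginal probabilities divide the row or column totals by the grand total, and conditional probabilities divide each cell by its row total or its column total. The sketch below applies these steps to the status-by-class counts; the variable and function names are hypothetical.

```python
# Count of Class by Status, from the table above.
counts = {
    "First-year": {"F/T": 9, "P/T": 6},
    "Second-year": {"F/T": 8, "P/T": 18},
    "Upper-level": {"F/T": 41, "P/T": 33},
}

def probability_tables(counts):
    """Return joint, row-marginal, column-marginal, and conditional tables."""
    cols = list(next(iter(counts.values())))
    row_totals = {r: sum(cells.values()) for r, cells in counts.items()}
    col_totals = {c: sum(counts[r][c] for r in counts) for c in cols}
    n = sum(row_totals.values())
    joint = {r: {c: counts[r][c] / n for c in cols} for r in counts}
    p_row = {r: t / n for r, t in row_totals.items()}   # P(Class)
    p_col = {c: t / n for c, t in col_totals.items()}   # P(Status)
    # P(column | row): divide each cell by its row total.
    col_given_row = {r: {c: counts[r][c] / row_totals[r] for c in cols} for r in counts}
    # P(row | column): divide each cell by its column total.
    row_given_col = {c: {r: counts[r][c] / col_totals[c] for r in counts} for c in cols}
    return joint, p_row, p_col, col_given_row, row_given_col

joint, p_class, p_status, p_status_given_class, p_class_given_status = probability_tables(counts)
print(round(p_status_given_class["First-year"]["F/T"], 4))
print(round(p_class_given_status["F/T"]["First-year"], 4))
```

The same function can be reused for the other cross-classifications in this case by swapping in the relevant count table.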

1. cont.
Student status and graduate school intentions

Count of Grad School Intention
                      Status
Intention      F/T      P/T      Grand Total
No             9        5        14
Not sure       23       22       45
Yes            26       30       56
Grand Total    58       57       115

Joint and Marginal Probabilities P(Intention, Status)
                      Status
Intention      F/T      P/T      P(Intention)
No             0.0783   0.0435   0.1217
Not sure       0.2000   0.1913   0.3913
Yes            0.2261   0.2609   0.4870
P(Status)      0.5043   0.4957   1.0000

Conditional Probabilities P(Status|Intention)
                      Status
Intention      F/T      P/T
No             0.6429   0.3571
Not sure       0.5111   0.4889
Yes            0.4643   0.5357

Conditional Probabilities P(Intention|Status)
                      Status
Intention      F/T      P/T
No             0.1552   0.0877
Not sure       0.3966   0.3860
Yes            0.4483   0.5263

1. cont.
Student status and employment status

Count of Employment
                      Status
Employment     F/T      P/T      Grand Total
F/T            13       7        20
Not            7        12       19
P/T            38       38       76
Grand Total    58       57       115

Joint and Marginal Probabilities P(Employment, Status)
                      Status
Employment     F/T      P/T      P(Employment)
F/T            0.1130   0.0609   0.1739
Not            0.0609   0.1043   0.1652
P/T            0.3304   0.3304   0.6609
P(Status)      0.5043   0.4957   1.0000

Conditional Probabilities P(Status|Employment)
                      Status
Employment     F/T      P/T
F/T            0.6500   0.3500
Not            0.3684   0.6316
P/T            0.5000   0.5000

Conditional Probabilities P(Employment|Status)
                      Status
Employment     F/T      P/T
F/T            0.2241   0.1228
Not            0.1207   0.2105
P/T            0.6552   0.6667

1. cont.
Student status and preferred instructional delivery mode

Count of Delivery Mode Preference
                        Status
Preference       F/T      P/T      Grand Total
Hybrid           17       12       29
None             11       7        18
Online asynch    5        12       17
Physical         16       19       35
Virtual          9        7        16
Grand Total      58       57       115

Joint and Marginal Probabilities P(Preference, Status)
                        Status
Preference       F/T      P/T      P(Preference)
Hybrid           0.1478   0.1043   0.2522
None             0.0957   0.0609   0.1565
Online asynch    0.0435   0.1043   0.1478
Physical         0.1391   0.1652   0.3043
Virtual          0.0783   0.0609   0.1391
P(Status)        0.5043   0.4957   1.0000

Conditional Probabilities P(Status|Preference)
                        Status
Preference       F/T      P/T
Hybrid           0.5862   0.4138
None             0.6111   0.3889
Online asynch    0.2941   0.7059
Physical         0.4571   0.5429
Virtual          0.5625   0.4375

Conditional Probabilities P(Preference|Status)
                        Status
Preference       F/T      P/T
Hybrid           0.2931   0.2105
None             0.1897   0.1228
Online asynch    0.0862   0.2105
Physical         0.2759   0.3333
Virtual          0.1552   0.1228

1. cont.
Current class and graduate school intentions

Count of Grad School Intention
                             Class
Intention      First-year   Second-year   Upper-level   Grand Total
No             2            4             8             14
Not sure       7            10            28            45
Yes            6            12            38            56
Grand Total    15           26            74            115

Joint and Marginal Probabilities P(Intention, Class)
                             Class
Intention      First-year   Second-year   Upper-level   P(Intention)
No             0.0174       0.0348        0.0696        0.1217
Not sure       0.0609       0.0870        0.2435        0.3913
Yes            0.0522       0.1043        0.3304        0.4870
P(Class)       0.1304       0.2261        0.6435        1.0000

Conditional Probabilities P(Class|Intention)
                             Class
Intention      First-year   Second-year   Upper-level
No             0.1429       0.2857        0.5714
Not sure       0.1556       0.2222        0.6222
Yes            0.1071       0.2143        0.6786

Conditional Probabilities P(Intention|Class)
                             Class
Intention      First-year   Second-year   Upper-level
No             0.1333       0.1538        0.1081
Not sure       0.4667       0.3846        0.3784
Yes            0.4000       0.4615        0.5135



1. Current class and employment status cont.

Count of Employment

                     Class
Employment           First-year   Second-year   Upper-level   Grand Total
F/T                               3             17            20
Not                  7            7             5             19
P/T                  8            16            52            76
Grand Total          15           26            74            115

Joint and Marginal Probabilities P(Employment, Class)

                     Class
Employment           First-year   Second-year   Upper-level   P(Class)
F/T                  0.0000       0.0261        0.1478        0.1739
Not                  0.0609       0.0609        0.0435        0.1652
P/T                  0.0696       0.1391        0.4522        0.6609
P(Employment)        0.1304       0.2261        0.6435        1.0000

Conditional Probabilities P(Class|Employment)

                     Class
Employment           First-year   Second-year   Upper-level
F/T                  0.0000       0.1500        0.8500
Not                  0.3684       0.3684        0.2632
P/T                  0.1053       0.2105        0.6842

Conditional Probabilities P(Employment|Class)

                     Class
Employment           First-year   Second-year   Upper-level
F/T                  0.0000       0.1154        0.2297
Not                  0.4667       0.2692        0.0676
P/T                  0.5333       0.6154        0.7027



1. Major and graduate school intentions cont.

Count of Grad School Intention

                           Intention
Major                      No    Not sure   Yes   Grand Total
Accounting                 1     1          14    16
Computing                  1     1          3     5
Finance                    1     3          3     7
Hospitality management           1          4     5
International Business     2     6          6     14
Marketing                  2     8          7     17
OR/Management science            2          5     7
Other                            2          4     6
Retail management          1     1          5     7
Statistics or Analytics    3     5          5     13
Undecided/No major         3     15               18
Grand Total                14    45         56    115

Major and employment status

Count of Employment

                           Employment
Major                      F/T   Not   P/T   Grand Total
Accounting                 5     1     10    16
Computing                  1           4     5
Finance                    2     1     4     7
Hospitality management     1     1     3     5
International Business     1     2     11    14
Marketing                  3     6     8     17
OR/Management science      1     2     4     7
Other                      1     1     4     6
Retail management                1     6     7
Statistics or Analytics    2     2     9     13
Undecided/No major         3     2     13    18
Grand Total                20    19    76    115



1. Major and preferred instructional delivery mode cont. Count of Delivery Mode Preference Preference

Major

Hybrid

None

Accounting

3

Computing

2

Finance

2

3

Hospitality management

2.

Online asynch

Grand Physical Virtual Total 4

4

2

16

2

1

5

1

2

2

7

1

2

2

5

International Business

4

2

1

5

2

14

Marketing

4

3

3

5

2

17

OR/Management science

3

1

1

2

7

Other

2

Retail management

1

Statistics or Analytics

1

3

1

3

1

1

7

2

3

1

5

2

13

Undecided/No major

6

5

2

5

Grand Total

29

18

17

35

In each pair, the two variables are not statistically independent.


6

18 16

115



Chapter 6

This case provides practice in determining whether the values of a numerical variable are normally distributed. Reports should, of course, exclude the attitudinal variables that may have been included in the Chapter 3 report. Note that values for height were generated by a random normal function and are the least ambiguous. Values for loans and GPA were generated by pairs of normal distributions. For loans, values for first-year students used one distribution, while all other classes used a second distribution. For GPA, values for accounting majors used one distribution, while all other majors used a second distribution. At a minimum, look for students to report that current credit hours is not normally distributed, based on a nonlinear normal probability plot.



1.

                      Age       Height    Loans     GPA       Current   Course     Expected
                                                              Credit    Materials  Starting
                                                              Hours                Salary
Mean                  21.9304   69.7739   32.8583   3.3171    9.7739    157.9826   62.4957
Median                22        70        33.3      3.45      12        142        63
Mode                  21        69        37        3.55      12        289        59
Minimum               17        64        3.4       1.91      1         31         12
Maximum               30        76        59.7      4.53      18        291        98
Range                 13        12        56.3      2.62      17        260        86
Variance              5.8548    8.1941    75.2518   0.3157    29.2291   5220.7716  358.1118
Standard Deviation    2.4197    2.8625    8.6748    0.5619    5.4064    72.2549    18.9238
Coeff. of Variation   11.03%    4.10%     26.40%    16.94%    55.31%    45.74%     30.28%
Skewness              0.8503    -0.0286   -0.2290   -0.4800   -0.2737   0.3316     -0.3706
Kurtosis              1.1478    -0.6865   0.8657    -0.2074   -1.3071   -0.9278    -0.3672
Count                 115       115       115       115       115       115        115
Standard Error        0.2256    0.2669    0.8089    0.0524    0.5041    6.7378     1.7647
First Quartile        20        67        27.3      2.96      5         107        50
Third Quartile        23        72        38.1      3.67      14        216        77
Interquartile Range   3         5         10.8      0.71      9         109        27
6*std dev             14.5180   17.1752   52.0487   3.3713    32.4384   433.5294   113.5431
1.33*std dev          3.2182    3.8072    11.5375   0.7473    7.1905    96.0990    25.1687

Age:

The mean is approximately equal to the median; the range is smaller than 6 times the standard deviation and the interquartile range is slightly smaller than 1.33 times the standard deviation.

Height:

The mean is approximately equal to the median; the range is smaller than 6 times the standard deviation and the interquartile range is larger than 1.33 times the standard deviation.

Loans:

The mean is smaller than the median; the range is larger than 6 times the standard deviation and the interquartile range is smaller than 1.33 times the standard deviation.

GPA:

The mean is approximately equal to the median; the range is smaller than 6 times the standard deviation and the interquartile range is approximately equal to 1.33 times the standard deviation.

Hours:

The mean is smaller than the median; the range is much smaller than 6 times the standard deviation and the interquartile range is larger than 1.33 times the standard deviation.

Materials:

The mean is larger than the median; the range is much smaller than 6 times the standard deviation and the interquartile range is larger than 1.33 times the standard deviation.

Salary:

The mean is slightly smaller than the median; the range is smaller than 6 times the standard deviation and the interquartile range is larger than 1.33 times the standard deviation.
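The comparisons above apply the same three checks to every variable: mean against median, range against 6 standard deviations, and interquartile range against 1.33 standard deviations. A minimal sketch of those checks follows; the function name and the returned keys are assumptions made for illustration, and the inputs are the summary values from the table.

```python
def normality_checks(mean, median, value_range, iqr, std_dev):
    """Return the three descriptive comparisons used to judge approximate normality.

    For a normal distribution the mean equals the median, the range is roughly
    6 standard deviations, and the interquartile range is roughly 1.33 standard
    deviations, so both ratios below would be close to 1.
    """
    return {
        "mean_minus_median": mean - median,
        "range_over_6sd": value_range / (6 * std_dev),
        "iqr_over_1_33sd": iqr / (1.33 * std_dev),
    }


# Summary statistics taken from the table above.
age = normality_checks(21.9304, 22, 13, 3, 2.4197)
loans = normality_checks(32.8583, 33.3, 56.3, 10.8, 8.6748)
hours = normality_checks(9.7739, 12, 17, 9, 5.4064)

for name, result in (("Age", age), ("Loans", loans), ("Hours", hours)):
    print(name, {key: round(value, 3) for key, value in result.items()})
```

A ratio well below 1 for the range, as with current credit hours, signals that the data are more compressed than a normal distribution with the same standard deviation would be.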



2.

Normal Probability Plots

[Plots omitted: Age, Height, Loans, GPA, Hours, Materials, Salary]

3.

Age and height appear to be roughly normally distributed. Loans and GPA appear to be normally distributed. Current credit hours and course materials are not normally distributed. Expected starting salary appears to be left-skewed.



Chapter 8

This case provides practice in computing confidence interval estimates. Note that the (undisclosed) true mean for the normal distribution from which values for height were chosen is 70, which is contained in the confidence interval estimate for height, 69.25 ≤ μ ≤ 70.30.

95% confidence interval of the means:

Age: 21.48 ≤ μ ≤ 22.38
Height: 69.25 ≤ μ ≤ 70.30
Loans: 31.26 ≤ μ ≤ 34.46
GPA: 3.21 ≤ μ ≤ 3.42
Current Credit Hours: 8.78 ≤ μ ≤ 10.77
Course Materials: 144.64 ≤ μ ≤ 171.33
Expected Starting Salary: 59.00 ≤ μ ≤ 65.99

95% Confidence Interval Estimate for the Proportion:

Status (sample size = 115)
F/T (58) 0.4130 ≤ π ≤ 0.5957
P/T (57) 0.4043 ≤ π ≤ 0.5870

Class (sample size = 115)
First-year (15) 0.0689 ≤ π ≤ 0.1920
Second-year (26) 0.1496 ≤ π ≤ 0.3025
Upper-level (74) 0.5559 ≤ π ≤ 0.7310

Transfer (sample size = 115)
No (102) 0.8291 ≤ π ≤ 0.9448
Yes (13) 0.0552 ≤ π ≤ 0.1709

Major (sample size = 115)
Accounting (16) 0.0759 ≤ π ≤ 0.2024
Computing (5) 0.0062 ≤ π ≤ 0.0808
Finance (7) 0.0172 ≤ π ≤ 0.1046
Hospitality Management (5) 0.0062 ≤ π ≤ 0.0808
International Business (14) 0.0620 ≤ π ≤ 0.1815
Marketing (17) 0.0830 ≤ π ≤ 0.2127
OR/Management science (7) 0.0172 ≤ π ≤ 0.1046
Other (6) 0.0115 ≤ π ≤ 0.0928
Retail management (7) 0.0172 ≤ π ≤ 0.1046
Statistics or Analytics (13) 0.0552 ≤ π ≤ 0.1709
Undecided/No major (18) 0.0901 ≤ π ≤ 0.2229

Graduate School Intentions (sample size = 115)
No (14) 0.0620 ≤ π ≤ 0.1815
Not sure (45) 0.3021 ≤ π ≤ 0.4805
Yes (56) 0.3956 ≤ π ≤ 0.5783

Assigned Sex (sample size = 115)
Female (57) 0.4043 ≤ π ≤ 0.5870
Male (58) 0.4130 ≤ π ≤ 0.5957

Gender Identity (sample size = 115)
Man (51) 0.3527 ≤ π ≤ 0.5343
Non-binary (9) 0.0292 ≤ π ≤ 0.1273
Prefer self-description (7) 0.0172 ≤ π ≤ 0.1046
Woman (48) 0.3273 ≤ π ≤ 0.5075

Employment (sample size = 115)
F/T (20) 0.1046 ≤ π ≤ 0.2432
Not (19) 0.0973 ≤ π ≤ 0.2331
P/T (76) 0.5743 ≤ π ≤ 0.7474

Delivery Mode Preference (sample size = 115)
Hybrid (29) 0.1728 ≤ π ≤ 0.3315
None (18) 0.0901 ≤ π ≤ 0.2229
Online asynch (17) 0.0830 ≤ π ≤ 0.2127
Physical (35) 0.2203 ≤ π ≤ 0.3884
Virtual (16) 0.0759 ≤ π ≤ 0.2024

Food-Dining Satisfaction (sample size = 115)
1 (12) 0.0485 ≤ π ≤ 0.1602
2 (7) 0.0172 ≤ π ≤ 0.1046
3 (13) 0.0552 ≤ π ≤ 0.1709
4 (35) 0.2203 ≤ π ≤ 0.3884
5 (23) 0.1269 ≤ π ≤ 0.2731
6 (17) 0.0830 ≤ π ≤ 0.2127
7 (8) 0.0231 ≤ π ≤ 0.1161

Athletic Satisfaction (sample size = 115)
1 (3) –0.0030 ≤ π ≤ 0.0552
2 (6) 0.0115 ≤ π ≤ 0.0928
3 (10) 0.0355 ≤ π ≤ 0.1385
4 (35) 0.2203 ≤ π ≤ 0.3884
5 (29) 0.1728 ≤ π ≤ 0.3315
6 (26) 0.1496 ≤ π ≤ 0.3025
7 (6) 0.0115 ≤ π ≤ 0.0928

Student Support Satisfaction (sample size = 115)
1 (5) 0.0062 ≤ π ≤ 0.0808
2 (1) –0.0083 ≤ π ≤ 0.0257
3 (15) 0.0689 ≤ π ≤ 0.1920
4 (46) 0.3105 ≤ π ≤ 0.4895
5 (19) 0.0973 ≤ π ≤ 0.2331
6 (11) 0.0419 ≤ π ≤ 0.1494
7 (18) 0.0901 ≤ π ≤ 0.2229

Development Center Visits (sample size = 115)
0 (17) 0.0830 ≤ π ≤ 0.2127
1 (24) 0.1344 ≤ π ≤ 0.2830
2 (9) 0.0292 ≤ π ≤ 0.1273
3 (6) 0.0115 ≤ π ≤ 0.0928
4 (29) 0.1728 ≤ π ≤ 0.3315
5 (14) 0.0620 ≤ π ≤ 0.1815
6 (9) 0.0292 ≤ π ≤ 0.1273
7 (6) 0.0115 ≤ π ≤ 0.0928
22 (1) –0.0083 ≤ π ≤ 0.0257

Recommended Course (sample size = 115)
BUS 1000 (32) 0.1964 ≤ π ≤ 0.3602
COM 2150 (17) 0.0830 ≤ π ≤ 0.2127
INB 2700 (26) 0.1496 ≤ π ≤ 0.3025
INB 2800 (2) –0.0065 ≤ π ≤ 0.0413
(blank) (38) 0.2445 ≤ π ≤ 0.4164

You are 95% confident that the mean age is between 21.48 and 22.38. Similar statements can be made for the other confidence intervals of the means.

You are 95% confident that the proportion of full-time students is between 0.4130 and 0.5957 and that the proportion of part-time students is between 0.4043 and 0.5870. Similar statements can be made for the other confidence intervals of the proportions.
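The proportion intervals listed for this case are consistent with the usual normal-approximation interval, p ± Z·sqrt(p(1 − p)/n). The sketch below is one way to reproduce them with the Python standard library; the function name is an assumption for illustration, and it is not necessarily how the worksheet results were produced.

```python
from math import sqrt
from statistics import NormalDist


def proportion_ci(successes, n, confidence=0.95):
    """Normal-approximation confidence interval for a population proportion."""
    p = successes / n
    # Two-tail critical value, about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    half_width = z * sqrt(p * (1 - p) / n)
    return p - half_width, p + half_width


full_time = proportion_ci(58, 115)  # Status: F/T
other_major = proportion_ci(6, 115)  # Major: Other
athletic_1 = proportion_ci(3, 115)  # Athletic Satisfaction: rating 1

for label, (low, high) in (("F/T", full_time), ("Other", other_major),
                           ("Athletic 1", athletic_1)):
    print(f"{label}: {low:.4f} to {high:.4f}")
```

With very small counts the lower limit of this interval can fall below zero, which is why a few of the reported intervals start at a negative value.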



Chapter 10

Question 1 requires some data preparation by students. Perhaps the simplest method is to extract data for full-time students and part-time students separately, using filtering or sorting techniques. With the two data sets (two worksheets or two data tables) in hand, copy the corresponding columns for each variable to be analyzed to a third worksheet or data table. Students using PHStat can (repeatedly) use Data Preparation → Unstack Data to prepare the two columns needed for each variable. (More advanced students using a specific data analysis program may find equivalent ways that exploit special features of the program being used.) In Question 2, the “no” and “not sure” categories should be combined to form the category “do not have current plans to attend graduate school.” This is best handled by redefining the grad school intention column to have “Yes” and “Not Yes” values as a first data preparation step. Answers to both questions illustrate how proper data preparation simplifies statistical analysis, calling back to observations the book makes in the earliest chapters. For courses that include group work, each group could be assigned to analyze one or a subset of the numerical variables that the questions use.
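For students working in a programming language rather than a worksheet, the two preparation steps might look like the sketch below. The records, field names, and the `unstack` helper are hypothetical and only illustrate the idea; they are not taken from the survey file or from PHStat.

```python
# Hypothetical records; field names and values are illustrative only.
records = [
    {"status": "F/T", "intention": "Yes", "age": 21},
    {"status": "P/T", "intention": "No", "age": 24},
    {"status": "F/T", "intention": "Not sure", "age": 20},
    {"status": "P/T", "intention": "Yes", "age": 23},
]


def unstack(rows, group_field, value_field):
    """Split one numerical column into separate lists, one per group value."""
    groups = {}
    for row in rows:
        groups.setdefault(row[group_field], []).append(row[value_field])
    return groups


# Question 1: one column of ages per student status.
age_by_status = unstack(records, "status", "age")

# Question 2: recode intention to "Yes" / "Not Yes" first, then unstack.
recoded = [
    {**row, "intention": "Yes" if row["intention"] == "Yes" else "Not Yes"}
    for row in records
]
age_by_intention = unstack(recoded, "intention", "age")

print(age_by_status)
print(age_by_intention)
```

Recoding before unstacking keeps the grouping logic identical for both questions, which mirrors the point made above about preparing the data first.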



1. Age:

Population: Larger Variance = P/T (57), Smaller Variance = F/T (58)
H0: σ₁² = σ₂² The population variances are the same.
H1: σ₁² ≠ σ₂² The population variances are different.

F Test for Differences in Two Variances

Data
Level of Significance                      0.05
Larger-Variance Sample
Sample Size                                57
Sample Variance                            6.738721805
Smaller-Variance Sample
Sample Size                                58
Sample Variance                            5.086509377

Intermediate Calculations
F Test Statistic                           1.3248
Population 1 Sample Degrees of Freedom     56
Population 2 Sample Degrees of Freedom     57

Two-Tail Test
Upper Critical Value                       1.6925
p-Value                                    0.2928
Do not reject the null hypothesis

Decision rule: If FSTAT > 1.6925, reject H0.
Test statistic: FSTAT = S₁²/S₂² = 1.3248

Decision: Since FSTAT = 1.3248 < 1.6925 and the p-value = 0.2928 > 0.05, do not reject H0. There is not enough evidence to conclude that the two population variances are different. Hence, the appropriate test for the difference in two means is the pooled-variance t test.
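The F statistic and the resulting choice of t test can be checked with a few lines of Python. This is a sketch under stated assumptions: the function name is illustrative, and the upper critical value is copied from the worksheet above rather than computed, because the standard library has no F distribution.

```python
def f_statistic(variance_a, variance_b):
    """Ratio of the larger sample variance to the smaller sample variance."""
    larger = max(variance_a, variance_b)
    smaller = min(variance_a, variance_b)
    return larger / smaller


# Sample variances of age for F/T and P/T students, from the worksheet above.
f_age = f_statistic(5.086509377, 6.738721805)

# Upper critical value copied from the worksheet (alpha = 0.05, two-tail).
upper_critical = 1.6925

# Do not reject equal variances -> pooled-variance t test; otherwise separate-variance.
chosen_test = "pooled-variance" if f_age <= upper_critical else "separate-variance"
print(round(f_age, 4), chosen_test)
```

Placing the larger variance in the numerator is what allows the decision rule to use only the upper critical value.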



1. Age: cont.

Population 1 = F/T (58), 2 = P/T (57)
H0: μ₁ = μ₂
H1: μ₁ ≠ μ₂

Pooled-Variance t Test for the Difference Between Two Means
(assumes equal population variances)

Data
Hypothesized Difference                    0
Level of Significance                      0.05
Population 1 Sample
Sample Size                                58
Sample Mean                                21.96551724
Sample Standard Deviation                  2.255329106
Population 2 Sample
Sample Size                                57
Sample Mean                                21.89473684
Sample Standard Deviation                  2.595904814

Intermediate Calculations
Population 1 Sample Degrees of Freedom     57
Population 2 Sample Degrees of Freedom     56
Total Degrees of Freedom                   113
Pooled Variance                            5.9053
Standard Error                             0.4532
Difference in Sample Means                 0.0708
t Test Statistic                           0.1562

Two-Tail Test
Lower Critical Value                       -1.9812
Upper Critical Value                       1.9812
p-Value                                    0.8762
Do not reject the null hypothesis

Decision: Since tSTAT = 0.1562 < 1.9812 and the p-value = 0.8762 > 0.05, do not reject H0. There is not sufficient evidence to conclude that the mean age is different between part-time and full-time students.
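The intermediate calculations in the pooled-variance worksheet can be reproduced from the summary statistics alone. The following sketch assumes the standard pooled-variance formulas; the function name and return layout are illustrative choices, not part of the worksheet.

```python
from math import sqrt


def pooled_variance_t(mean1, sd1, n1, mean2, sd2, n2):
    """Pooled-variance t statistic for the difference between two means."""
    df = n1 + n2 - 2
    # Weighted average of the two sample variances.
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    std_err = sqrt(pooled_var * (1 / n1 + 1 / n2))
    t_stat = (mean1 - mean2) / std_err
    return pooled_var, std_err, t_stat, df


# Age: population 1 = F/T (58), population 2 = P/T (57).
pooled_var, std_err, t_stat, df = pooled_variance_t(
    21.96551724, 2.255329106, 58,
    21.89473684, 2.595904814, 57,
)
print(round(pooled_var, 4), round(std_err, 4), round(t_stat, 4), df)
```

The critical values and p-value still require a t distribution with 113 degrees of freedom, which is why they are read from the worksheet rather than computed here.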



1. GPA: cont.

Population: Larger Variance = P/T (57), Smaller Variance = F/T (58)
H0: σ₁² = σ₂² The population variances are the same.
H1: σ₁² ≠ σ₂² The population variances are different.

F Test for Differences in Two Variances

Data
Level of Significance                      0.05
Larger-Variance Sample
Sample Size                                57
Sample Variance                            0.325556955
Smaller-Variance Sample
Sample Size                                58
Sample Variance                            0.311323291

Intermediate Calculations
F Test Statistic                           1.0457
Population 1 Sample Degrees of Freedom     56
Population 2 Sample Degrees of Freedom     57

Two-Tail Test
Upper Critical Value                       1.6925
p-Value                                    0.8665
Do not reject the null hypothesis

Decision rule: If FSTAT > 1.6925, reject H0.
Test statistic: FSTAT = S₁²/S₂² = 1.0457

Decision: Since FSTAT = 1.0457 < 1.6925 and the p-value = 0.8665 > 0.05, do not reject H0. There is not enough evidence to conclude that the two population variances are different. Hence, the appropriate test for the difference in two means is the pooled-variance t test.

1. GPA: cont.

Population 1 = F/T (58), 2 = P/T (57)
H0: μ₁ = μ₂
H1: μ₁ ≠ μ₂

Pooled-Variance t Test for the Difference Between Two Means
(assumes equal population variances)

Data
Hypothesized Difference                    0
Level of Significance                      0.05
Population 1 Sample
Sample Size                                58
Sample Mean                                3.328275862
Sample Standard Deviation                  0.557963521
Population 2 Sample
Sample Size                                57
Sample Mean                                3.305789474
Sample Standard Deviation                  0.570575985

Intermediate Calculations
Population 1 Sample Degrees of Freedom     57
Population 2 Sample Degrees of Freedom     56
Total Degrees of Freedom                   113
Pooled Variance                            0.3184
Standard Error                             0.1052
Difference in Sample Means                 0.0225
t Test Statistic                           0.2137

Two-Tail Test
Lower Critical Value                       -1.9812
Upper Critical Value                       1.9812
p-Value                                    0.8312
Do not reject the null hypothesis

Decision: Since tSTAT = 0.2137 < 1.9812 and the p-value = 0.8312 > 0.05, do not reject H0. There is not sufficient evidence to conclude that the mean GPA is different between part-time and full-time students.



1. Amount of current outstanding student loans: cont.

Population: Larger Variance = P/T (57), Smaller Variance = F/T (58)
H0: σ₁² = σ₂² The population variances are the same.
H1: σ₁² ≠ σ₂² The population variances are different.

F Test for Differences in Two Variances

Data
Level of Significance                      0.05
Larger-Variance Sample
Sample Size                                57
Sample Variance                            81.34800752
Smaller-Variance Sample
Sample Size                                58
Sample Variance                            69.10416213

Intermediate Calculations
F Test Statistic                           1.1772
Population 1 Sample Degrees of Freedom     56
Population 2 Sample Degrees of Freedom     57

Two-Tail Test
Upper Critical Value                       1.6925
p-Value                                    0.5412
Do not reject the null hypothesis

Decision rule: If FSTAT > 1.6925, reject H0.
Test statistic: FSTAT = S₁²/S₂² = 1.1772

Decision: Since FSTAT = 1.1772 < 1.6925 and the p-value = 0.5412 > 0.05, do not reject H0. There is not enough evidence to conclude that the two population variances are different. Hence, the appropriate test for the difference in two means is the pooled-variance t test.

1. Amount of current outstanding student loans: cont.

Population 1 = F/T (58), 2 = P/T (57)
H0: μ₁ = μ₂
H1: μ₁ ≠ μ₂

Pooled-Variance t Test for the Difference Between Two Means
(assumes equal population variances)

Data
Hypothesized Difference                    0
Level of Significance                      0.05
Population 1 Sample
Sample Size                                58
Sample Mean                                33.70689655
Sample Standard Deviation                  8.312891322
Population 2 Sample
Sample Size                                57
Sample Mean                                31.99473684
Sample Standard Deviation                  9.019313029

Intermediate Calculations
Population 1 Sample Degrees of Freedom     57
Population 2 Sample Degrees of Freedom     56
Total Degrees of Freedom                   113
Pooled Variance                            75.1719
Standard Error                             1.6171
Difference in Sample Means                 1.7122
t Test Statistic                           1.0588

Two-Tail Test
Lower Critical Value                       -1.9812
Upper Critical Value                       1.9812
p-Value                                    0.2919
Do not reject the null hypothesis

Decision: Since tSTAT = 1.0588 < 1.9812 and the p-value = 0.2919 > 0.05, do not reject H0. There is not sufficient evidence to conclude that the mean amount of current outstanding student loans is different between part-time and full-time students.



1. Amount spent on course materials: cont.

Population: Larger Variance = F/T (58), Smaller Variance = P/T (57)
H0: σ₁² = σ₂² The population variances are the same.
H1: σ₁² ≠ σ₂² The population variances are different.

F Test for Differences in Two Variances

Data
Level of Significance                      0.05
Larger-Variance Sample
Sample Size                                58
Sample Variance                            2837.317604
Smaller-Variance Sample
Sample Size                                57
Sample Variance                            1028.684211

Intermediate Calculations
F Test Statistic                           2.7582
Population 1 Sample Degrees of Freedom     57
Population 2 Sample Degrees of Freedom     56

Two-Tail Test
Upper Critical Value                       1.6946
p-Value                                    0.0002
Reject the null hypothesis

Decision rule: If FSTAT > 1.6946, reject H0.
Test statistic: FSTAT = S₁²/S₂² = 2.7582

Decision: Since FSTAT = 2.7582 > 1.6946 and the p-value = 0.0002 < 0.05, reject H0. There is enough evidence to conclude that the two population variances are different. Hence, the appropriate test for the difference in two means is the separate-variance t test.

1. Amount spent on course materials: cont.

Population 1 = F/T (58), 2 = P/T (57)
H0: μ₁ = μ₂
H1: μ₁ ≠ μ₂

Separate-Variances t Test for the Difference Between Two Means
(assumes unequal population variances)

Data
Hypothesized Difference                    0
Level of Significance                      0.05
Population 1 Sample
Sample Size                                58
Sample Mean                                214.6551724
Sample Standard Deviation                  53.2665
Population 2 Sample
Sample Size                                57
Sample Mean                                100.3157895
Sample Standard Deviation                  32.0731

Intermediate Calculations
Numerator of Degrees of Freedom            4484.4934
Denominator of Degrees of Freedom          47.8001
Total Degrees of Freedom                   93.8176
Degrees of Freedom                         93
Standard Error                             8.1833
Difference in Sample Means                 114.3394
Separate-Variance t Test Statistic         13.9723

Two-Tail Test
Lower Critical Value                       -1.9858
Upper Critical Value                       1.9858
p-Value                                    0.0000
Reject the null hypothesis

Decision: Since tSTAT = 13.9723 > 1.9858 and the p-value = 0.0000 < 0.05, reject H0. There is sufficient evidence to conclude that the mean amount spent on course materials is different between part-time and full-time students.
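The "Numerator" and "Denominator of Degrees of Freedom" rows in the separate-variance worksheet correspond to the usual approximation for the degrees of freedom when variances are unequal. The sketch below assumes that formula and truncates the result to an integer, as the worksheet does; the function name is illustrative.

```python
from math import floor, sqrt


def separate_variance_t(mean1, var1, n1, mean2, var2, n2):
    """Separate-variance t statistic and its approximate degrees of freedom."""
    v1 = var1 / n1  # squared standard error contributed by sample 1
    v2 = var2 / n2  # squared standard error contributed by sample 2
    std_err = sqrt(v1 + v2)
    numerator = (v1 + v2) ** 2
    denominator = v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1)
    return (mean1 - mean2) / std_err, numerator / denominator


# Course materials: population 1 = F/T (58), population 2 = P/T (57).
# The sample variances are taken from the F test worksheet above.
t_stat, total_df = separate_variance_t(
    214.6551724, 2837.317604, 58,
    100.3157895, 1028.684211, 57,
)
print(round(t_stat, 4), round(total_df, 4), floor(total_df))
```

Using the unrounded variances from the F test, rather than squaring the rounded standard deviations, keeps the result closer to the worksheet values.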



1. Expected annual starting salary: cont.

Population: Larger Variance = F/T (58), Smaller Variance = P/T (57)
H0: σ₁² = σ₂² The population variances are the same.
H1: σ₁² ≠ σ₂² The population variances are different.

F Test for Differences in Two Variances

Data
Level of Significance                      0.05
Larger-Variance Sample
Sample Size                                58
Sample Variance                            370.03902
Smaller-Variance Sample
Sample Size                                57
Sample Variance                            352.3552632

Intermediate Calculations
F Test Statistic                           1.0502
Population 1 Sample Degrees of Freedom     57
Population 2 Sample Degrees of Freedom     56

Two-Tail Test
Upper Critical Value                       1.6946
p-Value                                    0.8553
Do not reject the null hypothesis

Decision rule: If FSTAT > 1.6946, reject H0.
Test statistic: FSTAT = S₁²/S₂² = 1.0502

Decision: Since FSTAT = 1.0502 < 1.6946 and the p-value = 0.8553 > 0.05, do not reject H0. There is not enough evidence to conclude that the two population variances are different. Hence, the appropriate test for the difference in two means is the pooled-variance t test.

1. Expected annual starting salary: cont.

Population 1 = F/T (58), 2 = P/T (57)
H0: μ₁ = μ₂
H1: μ₁ ≠ μ₂

Pooled-Variance t Test for the Difference Between Two Means
(assumes equal population variances)

Data
Hypothesized Difference                    0
Level of Significance                      0.05
Population 1 Sample
Sample Size                                58
Sample Mean                                62.56896552
Sample Standard Deviation                  19.23639831
Population 2 Sample
Sample Size                                57
Sample Mean                                62.42105263
Sample Standard Deviation                  18.77112845

Intermediate Calculations
Population 1 Sample Degrees of Freedom     57
Population 2 Sample Degrees of Freedom     56
Total Degrees of Freedom                   113
Pooled Variance                            361.2754
Standard Error                             3.5450
Difference in Sample Means                 0.1479
t Test Statistic                           0.0417

Two-Tail Test
Lower Critical Value                       -1.9812
Upper Critical Value                       1.9812
p-Value                                    0.9668
Do not reject the null hypothesis

Decision: Since tSTAT = 0.0417 < 1.9812 and the p-value = 0.9668 > 0.05, do not reject H0. There is not sufficient evidence to conclude that the mean expected annual starting salary is different between part-time and full-time students.



2. Age:

Population: Larger Variance = Yes (56), Smaller Variance = Not yes (59)
H0: σ₁² = σ₂² The population variances are the same.
H1: σ₁² ≠ σ₂² The population variances are different.

F Test for Differences in Two Variances

Data
Level of Significance                      0.05
Larger-Variance Sample
Sample Size                                56
Sample Variance                            7.361038961
Smaller-Variance Sample
Sample Size                                59
Sample Variance                            4.184687317

Intermediate Calculations
F Test Statistic                           1.7590
Population 1 Sample Degrees of Freedom     55
Population 2 Sample Degrees of Freedom     58

Two-Tail Test
Upper Critical Value                       1.6907
p-Value                                    0.0352
Reject the null hypothesis

Decision rule: If FSTAT > 1.6907, reject H0.
Test statistic: FSTAT = S₁²/S₂² = 1.7590

Decision: Since FSTAT = 1.7590 > 1.6907 and the p-value = 0.0352 < 0.05, reject H0. There is enough evidence to conclude that the two population variances are different. Hence, the appropriate test for the difference in two means is the separate-variance t test.

2. Age: cont.

Population 1 = Not yes (59), 2 = Yes (56)
H0: μ₁ = μ₂
H1: μ₁ ≠ μ₂

Separate-Variances t Test for the Difference Between Two Means
(assumes unequal population variances)

Data
Hypothesized Difference                    0
Level of Significance                      0.05
Population 1 Sample
Sample Size                                59
Sample Mean                                21.52542373
Sample Standard Deviation                  2.0457
Population 2 Sample
Sample Size                                56
Sample Mean                                22.35714286
Sample Standard Deviation                  2.7131

Intermediate Calculations
Numerator of Degrees of Freedom            0.0410
Denominator of Degrees of Freedom          0.0004
Total Degrees of Freedom                   102.1617
Degrees of Freedom                         102
Standard Error                             0.4499
Difference in Sample Means                 -0.8317
Separate-Variance t Test Statistic         -1.8488

Two-Tail Test
Lower Critical Value                       -1.9835
Upper Critical Value                       1.9835
p-Value                                    0.0674
Do not reject the null hypothesis

Decision: Since tSTAT = –1.8488 > –1.9835 and the p-value = 0.0674 > 0.05, do not reject H0. There is not sufficient evidence to conclude that the mean age is different between students who have current plans to attend graduate school and those who do not.



2. GPA: cont.

Population: Larger Variance = Not yes (59), Smaller Variance = Yes (56)
H0: σ₁² = σ₂² The population variances are the same.
H1: σ₁² ≠ σ₂² The population variances are different.

F Test for Differences in Two Variances

Data
Level of Significance                      0.05
Larger-Variance Sample
Sample Size                                59
Sample Variance                            0.362404442
Smaller-Variance Sample
Sample Size                                56
Sample Variance                            0.262032955

Intermediate Calculations
F Test Statistic                           1.3830
Population 1 Sample Degrees of Freedom     58
Population 2 Sample Degrees of Freedom     55

Two-Tail Test
Upper Critical Value                       1.6970
p-Value                                    0.2277
Do not reject the null hypothesis

Decision rule: If FSTAT > 1.6970, reject H0.
Test statistic: FSTAT = S₁²/S₂² = 1.3830

Decision: Since FSTAT = 1.3830 < 1.6970 and the p-value = 0.2277 > 0.05, do not reject H0. There is not enough evidence to conclude that the two population variances are different. Hence, the appropriate test for the difference in two means is the pooled-variance t test.

2. GPA: cont.

Population 1 = Not yes (59), 2 = Yes (56)
H0: μ₁ = μ₂
H1: μ₁ ≠ μ₂

Pooled-Variance t Test for the Difference Between Two Means
(assumes equal population variances)

Data
Hypothesized Difference                    0
Level of Significance                      0.05
Population 1 Sample
Sample Size                                59
Sample Mean                                3.249152542
Sample Standard Deviation                  0.602000367
Population 2 Sample
Sample Size                                56
Sample Mean                                3.38875
Sample Standard Deviation                  0.511891546

Intermediate Calculations
Population 1 Sample Degrees of Freedom     58
Population 2 Sample Degrees of Freedom     55
Total Degrees of Freedom                   113
Pooled Variance                            0.3136
Standard Error                             0.1045
Difference in Sample Means                 -0.1396
t Test Statistic                           -1.3363

Two-Tail Test
Lower Critical Value                       -1.9812
Upper Critical Value                       1.9812
p-Value                                    0.1841
Do not reject the null hypothesis

Decision: Since tSTAT = –1.3363 > –1.9812 and the p-value = 0.1841 > 0.05, do not reject H0. There is not sufficient evidence to conclude that the mean GPA is different between students who have current plans to attend graduate school and those who do not.



2. Amount of current outstanding student loans: cont.

Population: Larger Variance = Not yes (59), Smaller Variance = Yes (56)
H0: σ₁² = σ₂² The population variances are the same.
H1: σ₁² ≠ σ₂² The population variances are different.

F Test for Differences in Two Variances

Data
Level of Significance                      0.05
Larger-Variance Sample
Sample Size                                59
Sample Variance                            76.45322034
Smaller-Variance Sample
Sample Size                                56
Sample Variance                            75.23178896

Intermediate Calculations
F Test Statistic                           1.0162
Population 1 Sample Degrees of Freedom     58
Population 2 Sample Degrees of Freedom     55

Two-Tail Test
Upper Critical Value                       1.6970
p-Value                                    0.9538
Do not reject the null hypothesis

Decision rule: If FSTAT > 1.6970, reject H0.
Test statistic: FSTAT = S₁²/S₂² = 1.0162

Decision: Since FSTAT = 1.0162 < 1.6970 and the p-value = 0.9538 > 0.05, do not reject H0. There is not enough evidence to conclude that the two population variances are different. Hence, the appropriate test for the difference in two means is the pooled-variance t test.

2. Amount of current outstanding student loans: cont.

Population 1 = Not yes (59), 2 = Yes (56)
H0: μ₁ = μ₂
H1: μ₁ ≠ μ₂

Pooled-Variance t Test for the Difference Between Two Means
(assumes equal population variances)

Data
Hypothesized Difference                    0
Level of Significance                      0.05
Population 1 Sample
Sample Size                                59
Sample Mean                                32.62372881
Sample Standard Deviation                  8.743753218
Population 2 Sample
Sample Size                                56
Sample Mean                                33.10535714
Sample Standard Deviation                  8.673626056

Intermediate Calculations
Population 1 Sample Degrees of Freedom     58
Population 2 Sample Degrees of Freedom     55
Total Degrees of Freedom                   113
Pooled Variance                            75.8587
Standard Error                             1.6249
Difference in Sample Means                 -0.4816
t Test Statistic                           -0.2964

Two-Tail Test
Lower Critical Value                       -1.9812
Upper Critical Value                       1.9812
p-Value                                    0.7675
Do not reject the null hypothesis

Decision: Since tSTAT = –0.2964 > –1.9812 and the p-value = 0.7675 > 0.05, do not reject H0. There is not sufficient evidence to conclude that the mean amount of current outstanding student loans is different between students who have current plans to attend graduate school and those who do not.



2. Amount spent on course materials: cont.
Population Larger Variance = Yes (56), Smaller Variance = Not yes (59)
$H_0: \sigma_1^2 = \sigma_2^2$  The population variances are the same.
$H_1: \sigma_1^2 \ne \sigma_2^2$  The population variances are different.

F Test for Differences in Two Variances

Data
Level of Significance                     0.05

Larger-Variance Sample
Sample Size                               56
Sample Variance                           5581.506169

Smaller-Variance Sample
Sample Size                               59
Sample Variance                           4902.552309

Intermediate Calculations
F Test Statistic                          1.1385
Population 1 Sample Degrees of Freedom    55
Population 2 Sample Degrees of Freedom    58

Two-Tail Test
Upper Critical Value                      1.6907
p-Value                                   0.6258
Do not reject the null hypothesis

Decision rule: If FSTAT > 1.6907, reject H0.
Test statistic: $F_{STAT} = S_1^2 / S_2^2 = 1.1385$


Decision: Since FSTAT = 1.1385 < 1.6907 and the p-value = 0.6258 > 0.05, do not reject H0. There is not enough evidence to conclude that the two population variances are different. Hence, the appropriate test for the difference in two means is the pooled-variance t test.
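The F statistic in this worksheet is simply the ratio of the two sample variances, with the larger one in the numerator. The following is a minimal Python sketch of that arithmetic using the course materials figures above. The function name `f_stat_two_variances` is hypothetical, and the upper critical value is copied from the worksheet rather than computed.

```python
def f_stat_two_variances(var_a, n_a, var_b, n_b):
    """Return (F, df_numerator, df_denominator).

    The larger sample variance is placed in the numerator, as in the
    worksheet above, so F is always at least 1.
    """
    if var_a >= var_b:
        return var_a / var_b, n_a - 1, n_b - 1
    return var_b / var_a, n_b - 1, n_a - 1


# Sample variances and sizes for "Yes" (56) and "Not yes" (59).
f_stat, df_num, df_den = f_stat_two_variances(5581.506169, 56, 4902.552309, 59)

# Upper critical value taken from the worksheet (alpha = 0.05, two-tail).
upper_critical = 1.6907
reject_h0 = f_stat > upper_critical

print(round(f_stat, 4), df_num, df_den, reject_h0)
```

Because the statistic falls below the critical value, the sketch reaches the same "do not reject" decision as the worksheet.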



2. Amount spent on course materials: cont.
Population 1 = Not yes (59), 2 = Yes (56)
$H_0: \mu_1 = \mu_2$
$H_1: \mu_1 \ne \mu_2$

Pooled-Variance t Test for the Difference Between Two Means
(assumes equal population variances)

Data
Hypothesized Difference                   0
Level of Significance                     0.05

Population 1 Sample
Sample Size                               59
Sample Mean                               163.6101695
Sample Standard Deviation                 70.0182284

Population 2 Sample
Sample Size                               56
Sample Mean                               152.0535714
Sample Standard Deviation                 74.70947844

Intermediate Calculations
Population 1 Sample Degrees of Freedom    58
Population 2 Sample Degrees of Freedom    55
Total Degrees of Freedom                  113
Pooled Variance                           5233.0166
Standard Error                            13.4960
Difference in Sample Means                11.5566
t Test Statistic                          0.8563

Two-Tail Test
Lower Critical Value                      -1.9812
Upper Critical Value                      1.9812
p-Value                                   0.3936
Do not reject the null hypothesis

Decision: Since tSTAT = 0.8563 < 1.9812 and the p-value = 0.3936 > 0.05, do not reject H0. There is not sufficient evidence to conclude that the mean amount spent on course materials is different between students who have current plans to attend graduate school and those who do not.
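The intermediate calculations in the pooled-variance worksheet can be reproduced from the summary statistics alone. The sketch below is an illustration, not the PHStat implementation; the function name `pooled_t` is hypothetical, and the inputs are the course materials figures listed above.

```python
import math


def pooled_t(mean1, sd1, n1, mean2, sd2, n2, hypothesized_diff=0.0):
    """Pooled-variance t statistic from summary statistics.

    Returns (pooled_variance, standard_error, t_statistic, degrees_of_freedom).
    """
    df = n1 + n2 - 2
    # Weighted average of the two sample variances.
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    std_err = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    t_stat = (mean1 - mean2 - hypothesized_diff) / std_err
    return pooled_var, std_err, t_stat, df


# Population 1 = Not yes (59), Population 2 = Yes (56).
pooled_var, std_err, t_stat, df = pooled_t(
    163.6101695, 70.0182284, 59,
    152.0535714, 74.70947844, 56,
)

# Critical value copied from the worksheet (two-tail, alpha = 0.05, df = 113).
reject_h0 = abs(t_stat) > 1.9812
print(round(pooled_var, 4), round(std_err, 4), round(t_stat, 4), df, reject_h0)
```

The values agree with the worksheet to the displayed precision (pooled variance about 5233.02, standard error about 13.496, t about 0.856).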



2. Expected annual starting salary: cont.
Population Larger Variance = Yes (56), Smaller Variance = Not yes (59)
$H_0: \sigma_1^2 = \sigma_2^2$  The population variances are the same.
$H_1: \sigma_1^2 \ne \sigma_2^2$  The population variances are different.

F Test for Differences in Two Variances

Data
Level of Significance                     0.05

Larger-Variance Sample
Sample Size                               56
Sample Variance                           380.5077922

Smaller-Variance Sample
Sample Size                               59
Sample Variance                           342.635301

Intermediate Calculations
F Test Statistic                          1.1105
Population 1 Sample Degrees of Freedom    55
Population 2 Sample Degrees of Freedom    58

Two-Tail Test
Upper Critical Value                      1.6907
p-Value                                   0.6931
Do not reject the null hypothesis

Decision rule: If FSTAT > 1.6907, reject H0.
Test statistic: $F_{STAT} = S_1^2 / S_2^2 = 1.1105$


Decision: Since FSTAT = 1.1105 < 1.6907 and the p-value = 0.6931 > 0.05, do not reject H0. There is not enough evidence to conclude that the two population variances are different. Hence, the appropriate test for the difference in two means is the pooled-variance t test.



2. Expected annual starting salary: cont.
Population 1 = Not yes (59), 2 = Yes (56)
$H_0: \mu_1 = \mu_2$
$H_1: \mu_1 \ne \mu_2$

Pooled-Variance t Test for the Difference Between Two Means
(assumes equal population variances)

Data
Hypothesized Difference                   0
Level of Significance                     0.05

Population 1 Sample
Sample Size                               59
Sample Mean                               62.05084746
Sample Standard Deviation                 18.51041061

Population 2 Sample
Sample Size                               56
Sample Mean                               62.96428571
Sample Standard Deviation                 19.50660894

Intermediate Calculations
Population 1 Sample Degrees of Freedom    58
Population 2 Sample Degrees of Freedom    55
Total Degrees of Freedom                  113
Pooled Variance                           361.0688
Standard Error                            3.5451
Difference in Sample Means                -0.9134
t Test Statistic                          -0.2577

Two-Tail Test
Lower Critical Value                      -1.9812
Upper Critical Value                      1.9812
p-Value                                   0.7971
Do not reject the null hypothesis

Decision: Since tSTAT = -0.2577 > -1.9812 and the p-value = 0.7971 > 0.05, do not reject H0. There is not sufficient evidence to conclude that the mean expected annual salary is different between students who have current plans to attend graduate school and those who do not.



Chapter 11

This case extends the Chapter 10 case and requires the data preparation done for that case. Recall that the "no" and "not sure" categories should be combined to form the category "do not have current plans to attend graduate school." This is best done by redefining the grad school intention column to have "Yes" and "Not Yes" values as a first data preparation step.

NOTE: In initial printings, question 1 lists the variables spending on textbooks and supplies, text messages sent in a week, and the wealth needed to feel rich; this is an editing error. Those three variables should be replaced by these two: amount spent on course materials and total number of credit hours this semester. This replacement makes the question consistent with the (correct) question 2.

1.

Salary (expected starting annual salary), based on academic major
$H_0: \sigma_1^2 = \sigma_2^2 = \cdots = \sigma_{11}^2$
H1: At least one variance is different.

Levene’s test for equality of variances:

SUMMARY
Groups                     Count    Sum    Average        Variance
Accounting                    16    218    13.625         175.9833
Computing                      5     92    18.4           293.3000
Finance                        7     70    10              79.0000
Hospitality management         5     47    9.4             71.3000
International Business        14     64    4.571428571     13.6484
Marketing                     17    144    8.470588235     48.6397
OR/Management science          7     28    4               13.0000
Other                          6     62    10.33333333     58.5667
Retail management              7    101    14.42857143    104.6190
Statistics or Analytics       13    158    12.15384615     66.6410
Undecided/No major            18    246    13.66666667    103.1765

ANOVA
Source of Variation    SS            df     MS          F         P-value    F crit
Between Groups          1653.7940     10    165.3794    1.8942    0.0540     1.9229
Within Groups           9080.0538    104     87.3082
Total                  10733.8478    114

Level of significance    0.05

Since p-value = 0.0540 > 0.05, do not reject H0. There is not enough evidence that the equal variance assumption needed for the one-way ANOVA F test for the difference in the means is violated. Hence, the one-way ANOVA F test for the difference in the mean can be used.
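The SUMMARY table for Levene's test is consistent with a one-way ANOVA carried out on absolute deviations rather than on the raw values. The sketch below assumes the common variant that takes absolute deviations from each group's median; the source does not state which center is used. The data are made up purely for illustration, and the function names are hypothetical.

```python
from statistics import mean, median


def one_way_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA on raw data."""
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    group_means = [mean(g) for g in groups]
    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means)
    )
    ss_within = sum(
        (x - m) ** 2 for g, m in zip(groups, group_means) for x in g
    )
    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


def levene_f(groups):
    """Levene-type test: ANOVA on absolute deviations from group medians.

    Assumption: deviations are measured from the median of each group.
    """
    abs_dev = [[abs(x - median(g)) for x in g] for g in groups]
    return one_way_f(abs_dev)


# Illustrative data only; these are not values from the student survey.
toy_groups = [[1, 2, 3, 4, 5], [2, 4, 6, 8, 10], [5, 5, 6, 6, 7]]
levene_stat, df_b, df_w = levene_f(toy_groups)
print(round(levene_stat, 4), df_b, df_w)
```

The resulting F statistic is compared with the F critical value for (c - 1, n - c) degrees of freedom, exactly as in the ANOVA table above.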



1. Salary (expected starting annual salary), based on academic major cont.
$H_0: \mu_1 = \mu_2 = \cdots = \mu_{11}$
H1: At least one mean is different.
Decision rule: df: 10,104. If F > 1.9229, reject H0.

One-way ANOVA F test for difference in the means:

ANOVA: Single Factor
SUMMARY
Groups                     Count    Sum     Average        Variance
Accounting                    16     786    49.125         359.9833
Computing                      5     218    43.6           644.3000
Finance                        7     576    82.28571429    157.5714
Hospitality management         5     357    71.4           167.3000
International Business        14    1170    83.57142857     33.4945
Marketing                     17    1114    65.52941176    124.6397
OR/Management science          7     443    63.28571429     31.5714
Other                          6     384    64             186.4000
Retail management              7     467    66.71428571    345.5714
Statistics or Analytics       13     798    61.38461538    226.2564
Undecided/No major            18     874    48.55555556    298.7320

ANOVA
Source of Variation    SS            df     MS           F         P-value    F crit
Between Groups         17815.1269     10    1781.5127    8.0522    0.0000     1.9229
Within Groups          23009.6209    104     221.2464
Total                  40824.7478    114

Level of significance    0.05

Test statistic: FSTAT = 8.0522
Since p-value = 0.0000 < 0.05, and FSTAT = 8.0522 > 1.9229, reject H0. There is enough evidence to conclude that there is a significant difference in the mean expected starting annual salary across the majors.


Using the Tukey-Kramer multiple comparisons procedure in PHStat, there is a difference in the mean expected starting annual salary between the following majors:
Accounting and Finance
Accounting and International Business
Computing and Finance
Computing and International Business
Finance and Statistics or Analytics
International Business and Marketing
International Business and Statistics or Analytics
International Business and Undecided/No major
Marketing and Undecided/No major


1. Age, based on academic major cont.
$H_0: \sigma_1^2 = \sigma_2^2 = \cdots = \sigma_{11}^2$
H1: At least one variance is different.

Levene’s test for equality of variances:

SUMMARY
Groups                     Count    Sum    Average        Variance
Accounting                    16     29    1.8125         7.2292
Computing                      5      3    0.6            0.3000
Finance                        7     15    2.142857143    2.4762
Hospitality management         5      8    1.6            1.8000
International Business        14     20    1.428571429    1.9560
Marketing                     17     28    1.647058824    2.4926
OR/Management science          7     13    1.857142857    7.1429
Other                          6     13    2.166666667    3.8667
Retail management              7     12    1.714285714    3.2381
Statistics or Analytics       13     16    1.230769231    1.1923
Undecided/No major            18     35    1.944444444    0.9673

ANOVA
Source of Variation    SS          df     MS        F         P-value    F crit
Between Groups          14.0667     10    1.4067    0.4729    0.9041     1.9229
Within Groups          309.3768    104    2.9748
Total                  323.4435    114

Level of significance    0.05

Since p-value = 0.9041 > 0.05, do not reject H0. There is not enough evidence that the equal variance assumption needed for the one-way ANOVA F test for the difference in the means is violated. Hence, the one-way ANOVA F test for the difference in the mean can be used.



1. Age, based on academic major cont.
$H_0: \mu_1 = \mu_2 = \cdots = \mu_{11}$
H1: At least one mean is different.
Decision rule: df: 10,104. If F > 1.9229, reject H0.

One-way ANOVA F test for difference in the means:

ANOVA: Single Factor
SUMMARY
Groups                     Count    Sum    Average        Variance
Accounting                    16    365    22.8125        10.0292
Computing                      5    111    22.2            0.7000
Finance                        7    161    23              6.6667
Hospitality management         5    115    23              5.0000
International Business        14    294    21              4.1538
Marketing                     17    367    21.58823529     5.0074
OR/Management science          7    149    21.28571429     9.2381
Other                          6    129    21.5            8.3000
Retail management              7    165    23.57142857     6.2857
Statistics or Analytics       13    279    21.46153846     2.6026
Undecided/No major            18    387    21.5            4.9706

ANOVA
Source of Variation    SS          df     MS        F         P-value    F crit
Between Groups          69.7147     10    6.9715    1.2130    0.2914     1.9229
Within Groups          597.7288    104    5.7474
Total                  667.4435    114

Level of significance    0.05

Test statistic: FSTAT = 1.2130
Since p-value = 0.2914 > 0.05, and FSTAT = 1.2130 < 1.9229, do not reject H0. There is insufficient evidence to conclude that the mean ages across the majors are different.


1. Materials (amount spent on course materials), based on academic major cont.
$H_0: \sigma_1^2 = \sigma_2^2 = \cdots = \sigma_{11}^2$
H1: At least one variance is different.

Levene’s test for equality of variances:

SUMMARY
Groups                     Count    Sum    Average        Variance
Accounting                    16    954    59.625         2107.7167
Computing                      5    348    69.6           1653.3000
Finance                        7    412    58.85714286    3783.4762
Hospitality management         5    201    40.2           2738.2000
International Business        14    908    64.85714286    1628.5934
Marketing                     17    991    58.29411765    2323.0956
OR/Management science          7    317    45.28571429    3304.2381
Other                          6    398    66.33333333    3490.1667
Retail management              7    295    42.14285714    3236.8095
Statistics or Analytics       13    694    53.38461538    2409.4231
Undecided/No major            18    915    50.83333333    1996.5000

ANOVA
Source of Variation    SS             df     MS           F         P-value    F crit
Between Groups           6985.5271     10     698.5527    0.2909    0.9819     1.9229
Within Groups          249774.5468    104    2401.6783
Total                  256760.0739    114

Level of significance    0.05

Since p-value = 0.9819 > 0.05, do not reject H0. There is not enough evidence that the equal variance assumption needed for the one-way ANOVA F test for the difference in the means is violated. Hence, the one-way ANOVA F test for the difference in the mean can be used.



1. Materials (amount spent on course materials), based on academic major cont.
$H_0: \mu_1 = \mu_2 = \cdots = \mu_{11}$
H1: At least one mean is different.
Decision rule: df: 10,104. If F > 1.9229, reject H0.

One-way ANOVA F test for difference in the means:

ANOVA: Single Factor
SUMMARY
Groups                     Count    Sum     Average        Variance
Accounting                    16    2532    158.25         5635.2667
Computing                      5    1016    203.2          7611.7000
Finance                        7    1031    147.2857143    6891.5714
Hospitality management         5     871    174.2          2972.2000
International Business        14    2180    155.7142857    6010.8352
Marketing                     17    2725    160.2941176    5651.5956
OR/Management science          7    1179    168.4285714    3869.9524
Other                          6    1178    196.3333333    8180.6667
Retail management              7     819    117            3879.6667
Statistics or Analytics       13    1874    144.1538462    5056.8077
Undecided/No major            18    2763    153.5          4329.9118

ANOVA
Source of Variation    SS             df     MS           F         P-value    F crit
Between Groups          36696.3102     10    3669.6310    0.6834    0.7377     1.9229
Within Groups          558471.6551    104    5369.9198
Total                  595167.9652    114

Level of significance    0.05

Test statistic: FSTAT = 0.6834
Since p-value = 0.7377 > 0.05, and FSTAT = 0.6834 < 1.9229, do not reject H0. There is insufficient evidence to conclude that the mean amount spent on course materials differs across the majors.


1. Hours (total number of credit hours this semester), based on academic major cont.
$H_0: \sigma_1^2 = \sigma_2^2 = \cdots = \sigma_{11}^2$
H1: At least one variance is different.

Levene’s test for equality of variances:

SUMMARY
Groups                     Count    Sum    Average        Variance
Accounting                    16     81    5.0625          4.4625
Computing                      5     18    3.6            29.3000
Finance                        7     29    4.142857143    15.1429
Hospitality management         5     11    2.2             7.7000
International Business        14     65    4.642857143    11.0165
Marketing                     17     80    4.705882353     4.2206
OR/Management science          7     23    3.285714286    14.5714
Other                          6     21    3.5             9.6000
Retail management              7     27    3.857142857     6.4762
Statistics or Analytics       13     60    4.615384615    15.7564
Undecided/No major            18     85    4.722222222    10.3007

ANOVA
Source of Variation    SS           df     MS         F         P-value    F crit
Between Groups           55.0749     10     5.5075    0.5429    0.8559     1.9229
Within Groups          1055.0121    104    10.1443
Total                  1110.0870    114

Level of significance    0.05

Since p-value = 0.8559 > 0.05, do not reject H0. There is not enough evidence that the equal variance assumption needed for the one-way ANOVA F test for the difference in the means is violated. Hence, the one-way ANOVA F test for the difference in the mean can be used.



1. Hours (total number of credit hours this semester), based on academic major cont.
$H_0: \mu_1 = \mu_2 = \cdots = \mu_{11}$
H1: At least one mean is different.
Decision rule: df: 10,104. If F > 1.9229, reject H0.

One-way ANOVA F test for difference in the means:

ANOVA: Single Factor
SUMMARY
Groups                     Count    Sum    Average        Variance
Accounting                    16    143    8.9375         31.7958
Computing                      5     66    13.2           35.7000
Finance                        7     67    9.571428571    28.2857
Hospitality management         5     70    14             12.5000
International Business        14    135    9.642857143    32.2473
Marketing                     17    143    8.411764706    27.3824
OR/Management science          7     75    10.71428571    25.2381
Other                          6     79    13.16666667    22.1667
Retail management              7     56    8              22.6667
Statistics or Analytics       13    103    7.923076923    28.5769
Undecided/No major            18    187    10.38888889    32.6046

ANOVA
Source of Variation    SS           df     MS         F         P-value    F crit
Between Groups          339.8753     10    33.9875    1.1813    0.3119     1.9229
Within Groups          2992.2465    104    28.7716
Total                  3332.1217    114

Level of significance    0.05

Test statistic: FSTAT = 1.1813
Since p-value = 0.3119 > 0.05, and FSTAT = 1.1813 < 1.9229, do not reject H0. There is insufficient evidence to conclude that the total number of credit hours this semester across the majors are different.




2. GPA, based on graduate school intention
$H_0: \sigma_1^2 = \sigma_2^2$
H1: At least one variance is different.

Levene’s test for equality of variances:

SUMMARY
Groups     Count    Sum      Average        Variance
Yes           56    22.25    0.397321429    0.1188
Not Yes       59    27.62    0.468135593    0.1479

ANOVA
Source of Variation    SS         df     MS        F         P-value    F crit
Between Groups          0.1441      1    0.1441    1.0773    0.3015     3.9251
Within Groups          15.1126    113    0.1337
Total                  15.2567    114

Level of significance    0.05

Since p-value = 0.3015 > 0.05, do not reject H0. There is not enough evidence that the equal variance assumption needed for the one-way ANOVA F test for the difference in the means is violated. Hence, the one-way ANOVA F test for the difference in the mean can be used.

$H_0: \mu_1 = \mu_2$
H1: At least one mean is different.
Decision rule: df: 1,113. If F > 3.9251, reject H0.

One-way ANOVA F test for difference in the means:

ANOVA: Single Factor
SUMMARY
Groups     Count    Sum       Average        Variance
Yes           56    189.77    3.38875        0.2620
Not Yes       59    191.7     3.249152542    0.3624

ANOVA
Source of Variation    SS         df     MS        F         P-value    F crit
Between Groups          0.5599      1    0.5599    1.7856    0.1841     3.9251
Within Groups          35.4313    113    0.3136
Total                  35.9912    114

Level of significance    0.05

Test statistic: FSTAT = 1.7856 Since p-value = 0.1841 > 0.05, and FSTAT = 1.7856 < 3.9251, do not reject H0. There is insufficient evidence to conclude that GPA across the graduate school intentions are different.
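The ANOVA table can be rebuilt from the SUMMARY rows alone, since the between-group sum of squares depends only on the counts and means and the within-group sum of squares only on the counts and variances. The sketch below is a hedged illustration with a hypothetical function name. It uses the rounded GPA variances printed above, so the last digits differ slightly from the worksheet.

```python
def anova_from_summary(counts, means, variances):
    """One-way ANOVA from group counts, means, and sample variances.

    Returns (ss_between, ss_within, f_statistic).
    """
    n = sum(counts)
    c = len(counts)
    grand_mean = sum(nj * mj for nj, mj in zip(counts, means)) / n
    # Between-group variation: group means around the grand mean.
    ss_between = sum(nj * (mj - grand_mean) ** 2 for nj, mj in zip(counts, means))
    # Within-group variation: (n_j - 1) times each sample variance.
    ss_within = sum((nj - 1) * vj for nj, vj in zip(counts, variances))
    f_stat = (ss_between / (c - 1)) / (ss_within / (n - c))
    return ss_between, ss_within, f_stat


# GPA by graduate school intention: Yes (56) and Not Yes (59).
ssa, ssw, f_stat = anova_from_summary(
    [56, 59], [3.38875, 3.249152542], [0.2620, 0.3624]
)
print(round(ssa, 4), round(ssw, 4), round(f_stat, 4))
```

With only two groups, this F statistic is the square of the pooled-variance t statistic for the same data, which is why the p-values of the two procedures coincide.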



2. Salary (expected starting annual salary), based on graduate school intention cont.
$H_0: \sigma_1^2 = \sigma_2^2$
H1: At least one variance is different.

Levene’s test for equality of variances:

SUMMARY
Groups     Count    Sum    Average        Variance
Yes           56    884    15.78571429    129.1896
Not Yes       59    892    15.11864407    111.0374

ANOVA
Source of Variation    SS            df     MS          F         P-value    F crit
Between Groups            12.7845      1     12.7845    0.1067    0.7446     3.9251
Within Groups          13545.5981    113    119.8725
Total                  13558.3826    114

Level of significance    0.05

Since p-value = 0.7446 > 0.05, do not reject H0. There is not enough evidence that the equal variance assumption needed for the one-way ANOVA F test for the difference in the means is violated. Hence, the one-way ANOVA F test for the difference in the mean can be used.

$H_0: \mu_1 = \mu_2$
H1: At least one mean is different.
Decision rule: df: 1,113. If F > 3.9251, reject H0.

One-way ANOVA F test for difference in the means:

ANOVA: Single Factor
SUMMARY
Groups     Count    Sum     Average        Variance
Yes           56    3526    62.96428571    380.5078
Not Yes       59    3661    62.05084746    342.6353

ANOVA
Source of Variation    SS            df     MS          F         P-value    F crit
Between Groups            23.9718      1     23.9718    0.0664    0.7971     3.9251
Within Groups          40800.7760    113    361.0688
Total                  40824.7478    114

Level of significance    0.05

Test statistic: FSTAT = 0.0664
Since p-value = 0.7971 > 0.05, and FSTAT = 0.0664 < 3.9251, do not reject H0. There is insufficient evidence to conclude that expected starting annual salary across the graduate school intentions are different.


2. Age, based on graduate school intention cont.
$H_0: \sigma_1^2 = \sigma_2^2$
H1: At least one variance is different.

Levene’s test for equality of variances:

SUMMARY
Groups     Count    Sum    Average        Variance
Yes           56    112    2              3.4182
Not Yes       59    101    1.711864407    1.4845

ANOVA
Source of Variation    SS          df     MS        F         P-value    F crit
Between Groups           2.3853      1    2.3853    0.9833    0.3235     3.9251
Within Groups          274.1017    113    2.4257
Total                  276.4870    114

Level of significance    0.05

Since p-value = 0.3235 > 0.05, do not reject H0. There is not enough evidence that the equal variance assumption needed for the one-way ANOVA F test for the difference in the means is violated. Hence, the one-way ANOVA F test for the difference in the mean can be used.

$H_0: \mu_1 = \mu_2$
H1: At least one mean is different.
Decision rule: df: 1,113. If F > 3.9251, reject H0.

One-way ANOVA F test for difference in the means:

ANOVA: Single Factor
SUMMARY
Groups     Count    Sum     Average        Variance
Yes           56    1252    22.35714286    7.3610
Not Yes       59    1270    21.52542373    4.1847

ANOVA
Source of Variation    SS          df     MS         F         P-value    F crit
Between Groups          19.8745      1    19.8745    3.4681    0.0652     3.9251
Within Groups          647.5690    113     5.7307
Total                  667.4435    114

Level of significance    0.05

Test statistic: FSTAT = 3.4681 Since p-value = 0.0652 > 0.05, and FSTAT = 3.4681 < 3.9251, do not reject H0. There is insufficient evidence to conclude that ages across the graduate school intentions are different.



2. Materials (amount spent on course materials), based on graduate school intention cont.
$H_0: \sigma_1^2 = \sigma_2^2$
H1: At least one variance is different.

Levene’s test for equality of variances:

SUMMARY
Groups     Count    Sum     Average        Variance
Yes           56    3517    62.80357143    2781.1607
Not Yes       59    3393    57.50847458    1786.1853

ANOVA
Source of Variation    SS             df     MS           F         P-value    F crit
Between Groups            805.5454      1     805.5454    0.3548    0.5526     3.9251
Within Groups          256562.5850    113    2270.4654
Total                  257368.1304    114

Level of significance    0.05

Since p-value = 0.5526 > 0.05, do not reject H0. There is not enough evidence that the equal variance assumption needed for the one-way ANOVA F test for the difference in the means is violated. Hence, the one-way ANOVA F test for the difference in the mean can be used.

$H_0: \mu_1 = \mu_2$
H1: At least one mean is different.
Decision rule: df: 1,113. If F > 3.9251, reject H0.

One-way ANOVA F test for difference in the means:

ANOVA: Single Factor
SUMMARY
Groups     Count    Sum     Average        Variance
Yes           56    8515    152.0535714    5581.5062
Not Yes       59    9653    163.6101695    4902.5523

ANOVA
Source of Variation    SS             df     MS           F         P-value    F crit
Between Groups           3837.0920      1    3837.0920    0.7332    0.3936     3.9251
Within Groups          591330.8732    113    5233.0166
Total                  595167.9652    114

Level of significance    0.05

Test statistic: FSTAT = 0.7332 Since p-value = 0.3936 > 0.05, and FSTAT = 0.7332 < 3.9251, do not reject H0. There is insufficient evidence to conclude that amount spent on course materials across the graduate school intentions are different.



2. Hours (total number of credit hours this semester), based on graduate school intention cont.
$H_0: \sigma_1^2 = \sigma_2^2$
H1: At least one variance is different.

Levene’s test for equality of variances:

SUMMARY
Groups     Count    Sum    Average        Variance
Yes           56    249    4.446428571     6.3607
Not Yes       59    279    4.728813559    14.6493

ANOVA
Source of Variation    SS           df     MS         F         P-value    F crit
Between Groups            2.2910      1     2.2910    0.2158    0.6431     3.9251
Within Groups          1199.5003    113    10.6150
Total                  1201.7913    114

Level of significance    0.05

Since p-value = 0.6431 > 0.05, do not reject H0. There is not enough evidence that the equal variance assumption needed for the one-way ANOVA F test for the difference in the means is violated. Hence, the one-way ANOVA F test for the difference in the mean can be used.

$H_0: \mu_1 = \mu_2$
H1: At least one mean is different.
Decision rule: df: 1,113. If F > 3.9251, reject H0.

One-way ANOVA F test for difference in the means:

ANOVA: Single Factor
SUMMARY
Groups     Count    Sum    Average        Variance
Yes           56    547    9.767857143    26.4360
Not Yes       59    577    9.779661017    32.3816

ANOVA
Source of Variation    SS           df     MS         F         P-value    F crit
Between Groups            0.0040      1     0.0040    0.0001    0.9907     3.9251
Within Groups          3332.1177    113    29.4878
Total                  3332.1217    114

Level of significance    0.05

Test statistic: FSTAT = 0.0001 Since p-value = 0.9907 > 0.05, and FSTAT = 0.0001 < 3.9251, do not reject H0. There is insufficient evidence to conclude that the total number of credit hours this semester across the graduate school intentions are different.



Chapter 12

This case provides practice in performing chi-square tests. As written, the three questions can be easily modified by eliminating one or more variables that the questions name.

1. Student Status and Major
H0: There is no relationship between student status and major.
H1: There is a relationship between student status and major.

Since $\chi^2_{STAT} = 6.8310$ is lower than the critical bound of 18.3070, do not reject H0. There is not enough evidence to conclude there is a relationship between student status and major.


1. Student Status and Graduate school intention cont.
H0: There is no relationship between student status and graduate school intention.
H1: There is a relationship between student status and graduate school intention.

Chi-Square Test

Observed Frequencies
                         Student Status
Grad School Intention    F/T         P/T         Total
No                         9           5          14
Not sure                  23          22          45
Yes                       26          30          56
Total                     58          57         115

Expected Frequencies
                         Student Status
Grad School Intention    F/T         P/T         Total
No                        7.06087     6.93913     14
Not sure                 22.69565    22.30435     45
Yes                      28.24348    27.75652     56
Total                    58          57          115

Data
Level of Significance        0.05
Number of Rows               3
Number of Columns            2
Degrees of Freedom           2

Results
Critical Value               5.991465
Chi-Square Test Statistic    1.442207
p-Value                      0.486215
Do not reject the null hypothesis

Expected frequency assumption is met.
Since $\chi^2_{STAT} = 1.442207$ is lower than the critical bound of 5.991465, do not reject H0. There is not enough evidence to conclude there is a relationship between student status and graduate school intention.
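The expected frequencies and the chi-square statistic follow directly from the observed table: each expected count is (row total × column total) / n, and the statistic sums (observed - expected)² / expected over all cells. The sketch below is an illustration with a hypothetical function name, applied to the student status by graduate school intention table above.

```python
def chi_square_independence(observed):
    """Chi-square test of independence for a table given as a list of rows.

    Returns (chi_square_statistic, degrees_of_freedom, expected_table).
    """
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    # Expected count under independence: row total * column total / n.
    expected = [[r * c / n for c in col_totals] for r in row_totals]
    chi2 = sum(
        (o - e) ** 2 / e
        for o_row, e_row in zip(observed, expected)
        for o, e in zip(o_row, e_row)
    )
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return chi2, df, expected


# Rows: No, Not sure, Yes. Columns: F/T, P/T.
observed = [[9, 5], [23, 22], [26, 30]]
chi2, df, expected = chi_square_independence(observed)

# Critical value copied from the worksheet (alpha = 0.05, df = 2).
reject_h0 = chi2 > 5.991465
print(round(chi2, 4), df, reject_h0)
```

The same function works for the 3 × 3 table of graduate school intention by employment status, where the degrees of freedom become 4.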



1. Student status and employment status cont.
H0: There is no relationship between student status and employment status.
H1: There is a relationship between student status and employment status.

Chi-Square Test

Observed Frequencies
              Student Status
Employment    F/T          P/T          Total
F/T            13            7           20
Not             7           12           19
P/T            38           38           76
Total          58           57          115

Expected Frequencies
              Student Status
Employment    F/T          P/T          Total
F/T           10.08696     9.913043      20
Not            9.582609    9.417391      19
P/T           38.33043    37.66957       76
Total         58          57            115

Data
Level of Significance        0.05
Number of Rows               3
Number of Columns            2
Degrees of Freedom           2

Results
Critical Value               5.991465
Chi-Square Test Statistic    3.107329
p-Value                      0.211472
Do not reject the null hypothesis

Expected frequency assumption is met.
Since $\chi^2_{STAT} = 3.107329$ is lower than the critical bound of 5.991465, do not reject H0. There is not enough evidence to conclude there is a relationship between student status and employment status.


1. Graduate school intention and Major cont.
H0: There is no relationship between graduate school intention and major.
H1: There is a relationship between graduate school intention and major.

The expected frequency assumption for the $\chi^2$ test is violated. The test results above might not be reliable.
Since $\chi^2_{STAT} = 37.98281$ is higher than the critical bound of 31.41043, reject H0. There is evidence to conclude there is a relationship between graduate school intention and major.


1. Employment status and Major cont.
H0: There is no relationship between employment status and major.
H1: There is a relationship between employment status and major.

The expected frequency assumption for the $\chi^2$ test is violated. The test results above might not be reliable.
Since $\chi^2_{STAT} = 13.15548$ is lower than the critical bound of 31.41043, do not reject H0. There is not enough evidence to conclude there is a relationship between employment status and major.


1. Graduate school intention and employment status cont.
H0: There is no relationship between graduate school intention and employment status.
H1: There is a relationship between graduate school intention and employment status.

Chi-Square Test

Observed Frequencies
                         Employment Status
Grad School Intention    F/T         Not         P/T          Total
No                        2           2          10            14
Not sure                  7          10          28            45
Yes                      11           7          38            56
Total                    20          19          76           115

Expected Frequencies
                         Employment Status
Grad School Intention    F/T         Not         P/T          Total
No                       2.434783    2.313043     9.252174     14
Not sure                 7.826087    7.434783    29.73913      45
Yes                      9.73913     9.252174    37.0087       56
Total                    20          19          76           115

Data
Level of Significance        0.05
Number of Rows               3
Number of Columns            3
Degrees of Freedom           4

Results
Critical Value               9.487729
Chi-Square Test Statistic    1.992445
p-Value                      0.737149
Do not reject the null hypothesis

Expected frequency assumption is met.
Since $\chi^2_{STAT} = 1.992445$ is lower than the critical bound of 9.487729, do not reject H0. There is not enough evidence to conclude there is a relationship between graduate school intention and employment status.


2. GPA: Population 1 = P/T (57), 2 = F/T (58)
$H_0: M_1 = M_2$
$H_1: M_1 \ne M_2$

Wilcoxon Rank Sum Test

Data
Level of Significance    0.05

Population 1 Sample
Sample Size              57
Sum of Ranks             3216

Population 2 Sample
Sample Size              58
Sum of Ranks             3454

Intermediate Calculations
Total Sample Size n      115
T1 Test Statistic        3216
T1 Mean                  3306
Standard Error of T1     178.7680
Z Test Statistic         -0.503446

Two-Tail Test
Lower Critical Value     -1.9600
Upper Critical Value     1.9600
p-Value                  0.6147
Do not reject the null hypothesis

Since p-value = 0.6147 > 0.05, do not reject H0. There is not enough evidence of any difference between full-time and part-time students in median grade point average.
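The intermediate calculations in the Wilcoxon rank sum worksheet use the large-sample normal approximation: T1 has mean n1(n + 1)/2 and standard error sqrt(n1 n2 (n + 1)/12). The sketch below reproduces those values for the GPA comparison. The function name is hypothetical, and no tie or continuity correction is applied, which is an assumption that happens to match the printed values.

```python
import math


def wilcoxon_rank_sum_z(t1, n1, n2):
    """Large-sample Z approximation for the Wilcoxon rank sum test.

    t1 is the sum of ranks for sample 1 (size n1).
    Returns (mean_t1, se_t1, z, two_tail_p_value).
    """
    n = n1 + n2
    mean_t1 = n1 * (n + 1) / 2
    se_t1 = math.sqrt(n1 * n2 * (n + 1) / 12)
    z = (t1 - mean_t1) / se_t1
    # Two-tail p-value from the standard normal: 2 * (1 - Phi(|z|)).
    p_two_tail = math.erfc(abs(z) / math.sqrt(2))
    return mean_t1, se_t1, z, p_two_tail


# GPA: Population 1 = P/T (57) with sum of ranks 3216, Population 2 = F/T (58).
mean_t1, se_t1, z, p_value = wilcoxon_rank_sum_z(3216, 57, 58)
print(mean_t1, round(se_t1, 4), round(z, 6), round(p_value, 4))
```

Substituting the other rank sums reported in this question (for example 3312.5 for expected starting salary) gives the corresponding Z statistics in the same way.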



2. Expected starting salary: cont.
Population 1 = P/T (57), 2 = F/T (58)
$H_0: M_1 = M_2$
$H_1: M_1 \ne M_2$

Wilcoxon Rank Sum Test

Data
Level of Significance    0.05

Population 1 Sample
Sample Size              57
Sum of Ranks             3312.5

Population 2 Sample
Sample Size              58
Sum of Ranks             3357.5

Intermediate Calculations
Total Sample Size n      115
T1 Test Statistic        3312.5
T1 Mean                  3306
Standard Error of T1     178.7680
Z Test Statistic         0.03636

Two-Tail Test
Lower Critical Value     -1.9600
Upper Critical Value     1.9600
p-Value                  0.9710
Do not reject the null hypothesis

Since p-value = 0.9710 > 0.05, do not reject H0. There is not enough evidence of any difference between full-time and part-time students in median expected starting salary.


2. cont. Age: Population 1 = P/T (57), Population 2 = F/T (58)
H0: M1 = M2
H1: M1 ≠ M2

Wilcoxon Rank Sum Test
Data
Level of Significance          0.05
Population 1 Sample
Sample Size                    57
Sum of Ranks                   3122.5
Population 2 Sample
Sample Size                    58
Sum of Ranks                   3547.5
Intermediate Calculations
Total Sample Size n            115
T1 Test Statistic              3122.5
T1 Mean                        3306
Standard Error of T1           178.7680
Z Test Statistic               -1.02647
Two-Tail Test
Lower Critical Value           -1.9600
Upper Critical Value           1.9600
p-Value                        0.3047
Do not reject the null hypothesis

Since p-value = 0.3047 > 0.05, do not reject H0. There is not enough evidence of any difference between full-time and part-time students in median age.


2. cont. Spending on course materials: Population 1 = P/T (57), Population 2 = F/T (58)
H0: M1 = M2
H1: M1 ≠ M2

Wilcoxon Rank Sum Test
Data
Level of Significance          0.05
Population 1 Sample
Sample Size                    57
Sum of Ranks                   1765
Population 2 Sample
Sample Size                    58
Sum of Ranks                   4905
Intermediate Calculations
Total Sample Size n            115
T1 Test Statistic              1765
T1 Mean                        3306
Standard Error of T1           178.7680
Z Test Statistic               -8.620111
Two-Tail Test
Lower Critical Value           -1.9600
Upper Critical Value           1.9600
p-Value                        0.0000
Reject the null hypothesis

Since p-value = 0.0000 < 0.05, reject H0. There is evidence of a difference between full-time and part-time students in median spending on course materials.


2. cont. Number of times visited the Student & Post-Graduate Development Center: Population 1 = P/T (57), Population 2 = F/T (58)
H0: M1 = M2
H1: M1 ≠ M2

Wilcoxon Rank Sum Test
Data
Level of Significance          0.05
Population 1 Sample
Sample Size                    57
Sum of Ranks                   1853
Population 2 Sample
Sample Size                    58
Sum of Ranks                   4817
Intermediate Calculations
Total Sample Size n            115
T1 Test Statistic              1853
T1 Mean                        3306
Standard Error of T1           178.7680
Z Test Statistic               -8.127853
Two-Tail Test
Lower Critical Value           -1.9600
Upper Critical Value           1.9600
p-Value                        0.0000
Reject the null hypothesis

Since p-value = 0.0000 < 0.05, reject H0. There is evidence of a difference between full-time and part-time students in the median number of times they visited the Student & Post-Graduate Development Center.


2. cont. Amount of current outstanding student loans: Population 1 = P/T (57), Population 2 = F/T (58)
H0: M1 = M2
H1: M1 ≠ M2

Wilcoxon Rank Sum Test
Data
Level of Significance          0.05
Population 1 Sample
Sample Size                    57
Sum of Ranks                   3168
Population 2 Sample
Sample Size                    58
Sum of Ranks                   3502
Intermediate Calculations
Total Sample Size n            115
T1 Test Statistic              3168
T1 Mean                        3306
Standard Error of T1           178.7680
Z Test Statistic               -0.77195
Two-Tail Test
Lower Critical Value           -1.9600
Upper Critical Value           1.9600
p-Value                        0.4401
Do not reject the null hypothesis

Since p-value = 0.4401 > 0.05, do not reject H0. There is not enough evidence of any difference between full-time and part-time students in median amount of current outstanding student loans.


3. GPA: Population 1 = Not yes (59), Population 2 = Yes (56) (Graduate school intentions, where "Not yes" is created from "No" and "Not sure")
H0: M1 = M2
H1: M1 ≠ M2

Wilcoxon Rank Sum Test
Data
Level of Significance          0.05
Population 1 Sample
Sample Size                    59
Sum of Ranks                   3174
Population 2 Sample
Sample Size                    56
Sum of Ranks                   3496
Intermediate Calculations
Total Sample Size n            115
T1 Test Statistic              3496
T1 Mean                        3248
Standard Error of T1           178.7139
Z Test Statistic               1.3876927
Two-Tail Test
Lower Critical Value           -1.9600
Upper Critical Value           1.9600
p-Value                        0.1652
Do not reject the null hypothesis

Since p-value = 0.1652 > 0.05, do not reject H0. There is not enough evidence of any difference between students who plan to go to graduate school and those who do not plan to go to graduate school in median grade point average.


3. cont. Expected starting salary: Population 1 = Not yes (59), Population 2 = Yes (56) (Graduate school intentions, where "Not yes" is created from "No" and "Not sure")
H0: M1 = M2
H1: M1 ≠ M2

Wilcoxon Rank Sum Test
Data
Level of Significance          0.05
Population 1 Sample
Sample Size                    59
Sum of Ranks                   3361
Population 2 Sample
Sample Size                    56
Sum of Ranks                   3309
Intermediate Calculations
Total Sample Size n            115
T1 Test Statistic              3309
T1 Mean                        3248
Standard Error of T1           178.7139
Z Test Statistic               0.3413276
Two-Tail Test
Lower Critical Value           -1.9600
Upper Critical Value           1.9600
p-Value                        0.7329
Do not reject the null hypothesis

Since p-value = 0.7329 > 0.05, do not reject H0. There is not enough evidence of any difference between students who plan to go to graduate school and those who do not plan to go to graduate school in median expected starting salary.


3. cont. Age: Population 1 = Not yes (59), Population 2 = Yes (56) (Graduate school intentions, where "Not yes" is created from "No" and "Not sure")
H0: M1 = M2
H1: M1 ≠ M2

Wilcoxon Rank Sum Test
Data
Level of Significance          0.05
Population 1 Sample
Sample Size                    59
Sum of Ranks                   3202.5
Population 2 Sample
Sample Size                    56
Sum of Ranks                   3467.5
Intermediate Calculations
Total Sample Size n            115
T1 Test Statistic              3467.5
T1 Mean                        3248
Standard Error of T1           178.7139
Z Test Statistic               1.2282199
Two-Tail Test
Lower Critical Value           -1.9600
Upper Critical Value           1.9600
p-Value                        0.2194
Do not reject the null hypothesis

Since p-value = 0.2194 > 0.05, do not reject H0. There is not enough evidence of any difference between students who plan to go to graduate school and those who do not plan to go to graduate school in median age.


3. cont. Spending for course materials: Population 1 = Not yes (59), Population 2 = Yes (56) (Graduate school intentions, where "Not yes" is created from "No" and "Not sure")
H0: M1 = M2
H1: M1 ≠ M2

Wilcoxon Rank Sum Test
Data
Level of Significance          0.05
Population 1 Sample
Sample Size                    59
Sum of Ranks                   3610
Population 2 Sample
Sample Size                    56
Sum of Ranks                   3060
Intermediate Calculations
Total Sample Size n            115
T1 Test Statistic              3060
T1 Mean                        3248
Standard Error of T1           178.7139
Z Test Statistic               -1.051961
Two-Tail Test
Lower Critical Value           -1.9600
Upper Critical Value           1.9600
p-Value                        0.2928
Do not reject the null hypothesis

Since p-value = 0.2928 > 0.05, do not reject H0. There is not enough evidence of any difference between students who plan to go to graduate school and those who do not plan to go to graduate school in median spending for course materials.

The Shelter Bay Lifestyles Case

Chapter 1

1. Gender is not well-defined. Suggestions could include: Man, Woman, Both, Neither, Other. A free-response gender might also be considered; "free response" is an acceptable definition, although one that may not be amenable to data analysis. Annual household income might need more specification, such as thousands of dollars.

2. This somewhat depends on how the survey is administered. If it is administered online, through choices offered on screens, then annual household income may need some recoding (because one respondent might write $30K while another might write 30,000), as might gender (to eliminate variations of the same category).

3. Product purchased: categorical, nominal; EX-10, EX-11, AIX-12
Store: categorical, nominal; store identifier
Years as customer: numerical, discrete; whole number
Age: categorical, ordinal; under 18, 18-34, 35-54, 55 or older
Gender: categorical, nominal; free response
Education: categorical, ordinal; high school or lower, college graduate, master’s degree, doctorate
Relationship status: categorical, nominal; single, partnered, separated
Income: numerical, discrete, ratio; a nonnegative dollars-and-cents number
ZIP code: categorical, nominal; five digits
Planned use: numerical, ratio; whole number
Planned miles: numerical, ratio
Self-rated fitness: categorical, ordinal; 1, 2, 3, 4, or 5

4. The data source should be a probability sample.



Chapter 2

1. [Charts not reproduced. Surviving chart titles, for each of EX-10, EX-11, and AIX-12: Years as Customer, Annual Household Income, Weekly Usage, Elliptical Miles, Fitness.]

2.

EX-10: The Web store sold 32, followed by Oxford Glen with 19, Springville with 13, Ashland with 12, and Galleria with just 4. EX-10 was preferred by the 18-34 age group, women, people with a master’s degree, and those whose relationship status is partnered. The average years as an EX-10 customer is 3.26 with a median of 3 years. The average annual household income of an EX-10 customer is $46,418, with a median of $46,617, although the minimum is $29,600 and the maximum is $68,200. The average weekly usage of the EX-10 is 3.0875 with a median of 3. The average number of elliptical miles of the EX-10 is 82.71 with a median of 84.6. The average level of fitness is 2.96 with a median of 3.00.

EX-11: The Oxford Glen store sold 21, followed by Web with 15, Ashland with 12, Springville with 7, and Galleria with just 5. EX-11 was preferred by the 18-34 age group, women, people with a master’s degree, and those whose relationship status is partnered. The average years as an EX-11 customer is 3.17 with a median of 2.95 years. The average annual household income of an EX-11 customer is $48,974, with a median of $49,460, although the minimum is $31,800 and the maximum is $67,100. The average weekly usage of the EX-11 is 3.0667 with a median of 3. The average number of elliptical miles of the EX-11 is 87.98 with a median of 84.8. The average level of fitness is 2.90 with a median of 3.00.

AIX-12: The Web store sold 14, followed by Oxford Glen with 10, Ashland with 8, Springville with 5, and Galleria with just 3. AIX-12 was preferred by the 18-34 age group, men, people with a master’s degree, and those whose relationship status is partnered. The average years as an AIX-12 customer is 3.57 with a median of 3.25 years. The average annual household income of an AIX-12 customer is $75,442, with a median of $76,569, although the minimum is $49,000 and the maximum is $105,000. The average weekly usage of the AIX-12 is 4.775 with a median of 5. The average number of elliptical miles of the AIX-12 is 166.9 with a median of 160, although the minimum is 80 and the maximum is 360. The average level of fitness is 4.625 with a median of 5.000.



Chapter 3

1.

EX-10 descriptive summary:

                      Years As    Annual Household  Weekly        Elliptical  Fitness
                      Customer    Income            Usage         Miles
Mean                  3.26375     46418.025         3.0875        82.71       2.9625
Median                3           46617             3             84.6        3
Mode                  2.1         46617             3             84.6        3
Minimum               1.4         29562             2             37.6        1
Maximum               6.8         68220             5             188         5
Range                 5.4         38658             3             150.4       4
Variance              2.0596      82369840.5057     0.6125        832.4434    0.4416
Standard Deviation    1.4351      9075.7832         0.7826        28.8521     0.6645
Coeff. of Variation   43.97%      19.55%            25.35%        34.88%      22.43%
Skewness              0.5641      0.1766            0.1691        1.0153      0.3065
Kurtosis              -0.6319     -0.6213           -0.6205       1.8517      1.9068
Count                 80          80                80            80          80
Standard Error        0.1605      1014.7034         0.0875        3.2258      0.0743
First Quartile        2           38658             3             65.8        3
Third Quartile        4.3         53439             4             94          3

EX-11 descriptive summary:

                      Years As    Annual Household  Weekly        Elliptical  Fitness
                      Customer    Income            Usage         Miles
Mean                  3.17        48973.65          3.066666667   87.98       2.9
Median                2.95        49459.5           3             84.8        3
Mode                  1.8         45480             3             95.4        3
Minimum               1.3         31836             2             21.2        1
Maximum               8.6         67083             5             212         4
Range                 7.3         35247             3             190.8       3
Variance              2.6428      74891532.3331     0.6395        1105.6986   0.3966
Standard Deviation    1.6257      8653.9894         0.7997        33.2520     0.6298
Coeff. of Variation   51.28%      17.67%            26.08%        37.80%      21.72%
Skewness              1.1797      -0.0105           0.4949        1.0859      -0.3454
Kurtosis              1.2182      -0.3250           0.0132        2.7948      0.7326
Count                 60          60                60            60          60
Standard Error        0.2099      1117.2252         0.1032        4.2928      0.0813
First Quartile        1.8         43206             3             63.6        3
Third Quartile        4           53439             4             106         3
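Several rows of the descriptive summaries follow directly from others: the range from the minimum and maximum, the variance and coefficient of variation from the standard deviation and mean, and the standard error from the standard deviation and count. The following is a minimal sketch that recomputes those rows for EX-10 Years As Customer; the helper name `derived_measures` is hypothetical, and small differences from the printed values reflect rounding of the inputs.

```python
import math


def derived_measures(mean, std_dev, n, minimum, maximum):
    """Recompute summary rows that depend only on other reported rows."""
    return {
        "range": maximum - minimum,
        "variance": std_dev ** 2,
        "cv_percent": 100 * std_dev / mean,        # coefficient of variation
        "standard_error": std_dev / math.sqrt(n),  # standard error of the mean
    }


# EX-10, Years As Customer (values taken from the summary table).
ex10_years = derived_measures(
    mean=3.26375, std_dev=1.4351, n=80, minimum=1.4, maximum=6.8
)
for name, value in ex10_years.items():
    print(f"{name}: {value:.4f}")
```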


1. cont. AIX-12 descriptive summary:

                      Years As    Annual Household  Weekly        Elliptical  Fitness
                      Customer    Income            Usage         Miles
Mean                  3.5675      75441.575         4.775         166.9       4.625
Median                3.25        76568.5           5             160         5
Mode                  3.5         90886             4             100         5
Minimum               1.2         48556             3             80          3
Maximum               8.1         104581            7             360         5
Range                 6.9         56025             4             280         2
Variance              3.0058      342465992.7122    0.8968        3607.9897   0.4455
Standard Deviation    1.7337      18505.8367        0.9470        60.0665     0.6675
Coeff. of Variation   48.60%      24.53%            19.83%        35.99%      14.43%
Skewness              0.8251      -0.0796           0.6694        1.1340      -1.5742
Kurtosis              0.2614      -1.4521           -0.2245       1.8213      1.2049
Count                 40          40                40            40          40
Standard Error        0.2741      2926.0297         0.1497        9.4974      0.1055
First Quartile        2.3         57271             4             120         4
Third Quartile        4.3         90886             5             200         5

2.

EX-10
Half of the purchasers of EX-10 have been customers for 3 years or less, with a mean of 3.26 years. The standard deviation is 1.44 years, with the fewest years at 1.4 and the most at 6.8. The middle 50% of the customers fall between 2 and 4.3 years as a customer. Years as a customer is right-skewed.

Half of the customers have an annual household income of $46,617 or less, with a mean annual income of $46,418. The standard deviation is $9,076, with a minimum income of $29,562 and a maximum income of $68,220. The middle 50% of the customers have an annual income that falls between $38,658 and $53,439. The annual household income is almost symmetrical.

Half of the customers of EX-10 have a usage value of 3 or less, with a mean of 3.09. The standard deviation is 0.783, with the lowest value of 2 and the highest of 5. The middle 50% of the usage values fall between 3 and 4. The usage value is nearly symmetrical.

Half of the customers of EX-10 expect to use the elliptical for 84.6 miles or less each week, with a mean of 82.71 miles. The standard deviation is 28.85 miles, with the lowest value of 37.6 miles and the highest value of 188 miles. The middle 50% of the customers expect to use the elliptical between 65.8 miles and 94 miles per week. The average number of miles the customer expects to use the elliptical each week is left-skewed.

Half of the fitness values are 3 or less, with a mean value of 2.96. The standard deviation is 0.6645, with the lowest value of 1 and the highest of 5. The middle 50% of the fitness values are all equal to 3. The fitness value is symmetrical.


2. cont.
EX-11
Half of the purchasers of EX-11 have been customers for 2.95 years or less, with a mean of 3.17 years. The standard deviation is 1.63 years, with the fewest years at 1.3 and the most at 8.6. The middle 50% of the customers fall between 1.8 and 4 years as a customer. Years as a customer is right-skewed.

Half of the customers have an annual household income of $49,459.50 or less, with a mean annual income of $48,974. The standard deviation is $8,654, with a minimum income of $31,836 and a maximum income of $67,083. The middle 50% of the customers have an annual income that falls between $43,206 and $53,439. The annual household income is left-skewed.

Half of the customers of EX-11 have a usage value of 3 or less, with a mean of 3.067. The standard deviation is 0.80, with the lowest value of 2 and the highest of 5. The middle 50% of the usage values fall between 3 and 4. The usage value is nearly symmetrical.

Half of the customers of EX-11 expect to use the elliptical for 84.8 miles or less each week, with a mean of 87.98 miles. The standard deviation is 33.25 miles, with the lowest value of 21.2 miles and the highest value of 212 miles. The middle 50% of the customers expect to use the elliptical between 63.6 miles and 106 miles per week. The average number of miles the customer expects to use the elliptical each week is right-skewed.

Half of the fitness values are 3 or less, with a mean value of 2.9. The standard deviation is 0.63, with the lowest value of 1 and the highest of 4. The middle 50% of the fitness values are all equal to 3. The fitness value is left-skewed.

AIX-12
Half of the purchasers of AIX-12 have been customers for 3.25 years or less, with a mean of 3.57 years. The standard deviation is 1.73 years, with the fewest years at 1.2 and the most at 8.1. The middle 50% of the customers fall between 2.3 and 4.3 years as a customer. Years as a customer is right-skewed.

Half of the customers have an annual household income of $76,568.50 or less, with a mean annual income of $75,442. The standard deviation is $18,506, with a minimum income of $48,556 and a maximum income of $104,581. The middle 50% of the customers have an annual income that falls between $57,271 and $90,886. The annual household income is almost symmetrical.

Half of the customers of AIX-12 have a usage value of 5 or less, with a mean of 4.775. The standard deviation is 0.947, with the lowest value of 3 and the highest of 7. The middle 50% of the usage values fall between 4 and 5. The usage value is left-skewed.

Half of the customers of AIX-12 expect to use the elliptical for 160 miles or less each week, with a mean of 166.9 miles. The standard deviation is 60.07 miles, with the lowest value of 80 miles and the highest value of 360 miles. The middle 50% of the customers expect to use the elliptical between 120 miles and 200 miles per week. The average number of miles the customer expects to use the elliptical each week is right-skewed.

Half of the fitness values are 5 or less, with a mean value of 4.625. The standard deviation is 0.6675, with the lowest value of 3 and the highest of 5. The middle 50% of the fitness values are between 4 and 5. The fitness value is left-skewed.


Chapter 4

All the contingency tables are given here for EX-10. Similar steps can be made for EX-11 and AIX-12.

1. EX-10 Gender, Education
Count of Highest Level of Education

Gender              college     high school   master’s    Grand
                    graduate    or lower      degree      Total
Cis Man             0           1             2           3
Man                 1           1             17          19
Non-binary          0           4             2           6
Prefer not to say   1           3             3           7
Trans               0           2             1           3
Trans Woman         0           0             1           1
Woman               2           8             14          24
(blank)             0           11            6           17
Grand Total         4           30            46          80

EX-10 Gender, Relationship
Count of Relationship Status

Gender              Partnered   Single      Grand Total
Cis Man             2           1           3
Man                 10          9           19
Non-binary          3           3           6
Prefer not to say   3           4           7
Trans               3           0           3
Trans Woman         1           0           1
Woman               16          8           24
(blank)             10          7           17
Grand Total         48          32          80

EX-10 Gender, Fitness
Count of Fitness

Gender              1       2       3       4       5       Grand Total
Cis Man             0       0       3       0       0       3
Man                 1       2       13      2       1       19
Non-binary          0       1       4       1       0       6
Prefer not to say   0       0       7       0       0       7
Trans               0       2       1       0       0       3
Trans Woman         0       0       0       1       0       1
Woman               0       6       16      1       1       24
(blank)             0       3       10      4       0       17
Grand Total         1       14      54      9       2       80

1. cont. EX-10 Degree, Relationship
Count of Highest Level of Education

Education              Partnered   Single      Grand Total
college graduate       2           2           4
high school or lower   18          12          30
master’s degree        28          18          46
Grand Total            48          32          80

EX-10 Degree, Fitness
Count of Highest Level of Education

Education              1       2       3       4       5       Grand Total
college graduate       0       1       2       1       0       4
high school or lower   0       6       20      4       0       30
master’s degree        1       7       32      4       2       46
Grand Total            1       14      54      9       2       80

EX-10 Relationship, Fitness
Count of Relationship Status

Relationship        1       2       3       4       5       Grand Total
Partnered           1       11      31      4       1       48
Single              0       3       23      5       1       32
Grand Total         1       14      54      9       2       80


NOTE: All the conditional and marginal probabilities are given here for EX-10. Similar steps can be made for EX-11 and AIX-12.

2. EX-10 Gender, Education
Joint and Marginal Probabilities, P(Gender, Education)

Gender              college     high school   master’s
                    graduate    or lower      degree      P(Gender)
Cis Man             0.0000      0.0125        0.0250      0.0375
Man                 0.0125      0.0125        0.2125      0.2375
Non-binary          0.0000      0.0500        0.0250      0.0750
Prefer not to say   0.0125      0.0375        0.0375      0.0875
Trans               0.0000      0.0250        0.0125      0.0375
Trans Woman         0.0000      0.0000        0.0125      0.0125
Woman               0.0250      0.1000        0.1750      0.3000
(blank)             0.0000      0.1375        0.0750      0.2125
P(Education)        0.0500      0.3750        0.5750      1.0000

EX-10 Gender, Education
Conditional Probabilities, P(Education|Gender)

Gender              college     high school   master’s
                    graduate    or lower      degree
Cis Man             0.0000      0.3333        0.6667
Man                 0.0526      0.0526        0.8947
Non-binary          0.0000      0.6667        0.3333
Prefer not to say   0.1429      0.4286        0.4286
Trans               0.0000      0.6667        0.3333
Trans Woman         0.0000      0.0000        1.0000
Woman               0.0833      0.3333        0.5833
(blank)             0.0000      0.6471        0.3529

EX-10 Gender, Education
Conditional Probabilities, P(Gender|Education)

Gender              college     high school   master’s
                    graduate    or lower      degree
Cis Man             0.0000      0.0333        0.0435
Man                 0.2500      0.0333        0.3696
Non-binary          0.0000      0.1333        0.0435
Prefer not to say   0.2500      0.1000        0.0652
Trans               0.0000      0.0667        0.0217
Trans Woman         0.0000      0.0000        0.0217
Woman               0.5000      0.2667        0.3043
(blank)             0.0000      0.3667        0.1304


2. cont. EX-10 Gender, Relationship
Joint and Marginal Probabilities, P(Gender, Relationship)

Gender              Partnered   Single      P(Gender)
Cis Man             0.0250      0.0125      0.0375
Man                 0.1250      0.1125      0.2375
Non-binary          0.0375      0.0375      0.0750
Prefer not to say   0.0375      0.0500      0.0875
Trans               0.0375      0.0000      0.0375
Trans Woman         0.0125      0.0000      0.0125
Woman               0.2000      0.1000      0.3000
(blank)             0.1250      0.0875      0.2125
P(Relationship)     0.6000      0.4000      1.0000

EX-10 Gender, Relationship
Conditional Probabilities, P(Relationship|Gender)

Gender              Partnered   Single
Cis Man             0.6667      0.3333
Man                 0.5263      0.4737
Non-binary          0.5000      0.5000
Prefer not to say   0.4286      0.5714
Trans               1.0000      0.0000
Trans Woman         1.0000      0.0000
Woman               0.6667      0.3333
(blank)             0.5882      0.4118

EX-10 Gender, Relationship
Conditional Probabilities, P(Gender|Relationship)

Gender              Partnered   Single
Cis Man             0.0417      0.0313
Man                 0.2083      0.2813
Non-binary          0.0625      0.0938
Prefer not to say   0.0625      0.1250
Trans               0.0625      0.0000
Trans Woman         0.0208      0.0000
Woman               0.3333      0.2500
(blank)             0.2083      0.2188


2. cont. EX-10 Gender, Fitness
Joint and Marginal Probabilities, P(Gender, Fitness)

Gender              1         2         3         4         5         P(Gender)
Cis Man             0.0000    0.0000    0.0375    0.0000    0.0000    0.0375
Man                 0.0125    0.0250    0.1625    0.0250    0.0125    0.2375
Non-binary          0.0000    0.0125    0.0500    0.0125    0.0000    0.0750
Prefer not to say   0.0000    0.0000    0.0875    0.0000    0.0000    0.0875
Trans               0.0000    0.0250    0.0125    0.0000    0.0000    0.0375
Trans Woman         0.0000    0.0000    0.0000    0.0125    0.0000    0.0125
Woman               0.0000    0.0750    0.2000    0.0125    0.0125    0.3000
(blank)             0.0000    0.0375    0.1250    0.0500    0.0000    0.2125
P(Fitness)          0.0125    0.1750    0.6750    0.1125    0.0250    1.0000

EX-10 Gender, Fitness
Conditional Probabilities, P(Fitness|Gender)

Gender              1         2         3         4         5
Cis Man             0.0000    0.0000    1.0000    0.0000    0.0000
Man                 0.0526    0.1053    0.6842    0.1053    0.0526
Non-binary          0.0000    0.1667    0.6667    0.1667    0.0000
Prefer not to say   0.0000    0.0000    1.0000    0.0000    0.0000
Trans               0.0000    0.6667    0.3333    0.0000    0.0000
Trans Woman         0.0000    0.0000    0.0000    1.0000    0.0000
Woman               0.0000    0.2500    0.6667    0.0417    0.0417
(blank)             0.0000    0.1765    0.5882    0.2353    0.0000

EX-10 Gender, Fitness
Conditional Probabilities, P(Gender|Fitness)

Gender              1         2         3         4         5
Cis Man             0.0000    0.0000    0.0556    0.0000    0.0000
Man                 1.0000    0.1429    0.2407    0.2222    0.5000
Non-binary          0.0000    0.0714    0.0741    0.1111    0.0000
Prefer not to say   0.0000    0.0000    0.1296    0.0000    0.0000
Trans               0.0000    0.1429    0.0185    0.0000    0.0000
Trans Woman         0.0000    0.0000    0.0000    0.1111    0.0000
Woman               0.0000    0.4286    0.2963    0.1111    0.5000
(blank)             0.0000    0.2143    0.1852    0.4444    0.0000


2. cont. EX-10 Degree, Relationship
Joint and Marginal Probabilities, P(Education, Relationship)

Education              Partnered   Single      P(Education)
college graduate       0.0250      0.0250      0.0500
high school or lower   0.2250      0.1500      0.3750
master’s degree        0.3500      0.2250      0.5750
P(Relationship)        0.6000      0.4000      1.0000

EX-10 Degree, Relationship
Conditional Probabilities, P(Relationship|Education)

Education              Partnered   Single
college graduate       0.5000      0.5000
high school or lower   0.6000      0.4000
master’s degree        0.6087      0.3913

EX-10 Degree, Relationship
Conditional Probabilities, P(Education|Relationship)

Education              Partnered   Single
college graduate       0.0417      0.0625
high school or lower   0.3750      0.3750
master’s degree        0.5833      0.5625
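Each probability table above is obtained from the corresponding contingency table of counts. The following is a minimal sketch of those steps for the EX-10 Degree, Relationship counts; the variable names are illustrative assumptions, not names from the original worksheets.

```python
# Contingency counts for EX-10: education (rows) by relationship status (columns).
counts = {
    "college graduate": {"Partnered": 2, "Single": 2},
    "high school or lower": {"Partnered": 18, "Single": 12},
    "master's degree": {"Partnered": 28, "Single": 18},
}
columns = ["Partnered", "Single"]
total = sum(sum(row.values()) for row in counts.values())  # 80 customers

# Joint probabilities: each cell count divided by the grand total.
joint = {r: {c: counts[r][c] / total for c in columns} for r in counts}

# Marginal probabilities: row totals and column totals divided by the grand total.
p_education = {r: sum(counts[r].values()) / total for r in counts}
p_relationship = {c: sum(counts[r][c] for r in counts) / total for c in columns}

# Conditional probabilities: joint probability divided by the marginal
# probability of the conditioning event.
p_rel_given_edu = {
    r: {c: joint[r][c] / p_education[r] for c in columns} for r in counts
}
p_edu_given_rel = {
    r: {c: joint[r][c] / p_relationship[c] for c in columns} for r in counts
}

print(round(p_rel_given_edu["master's degree"]["Partnered"], 4))
print(round(p_edu_given_rel["master's degree"]["Partnered"], 4))
```

The same steps apply to the other variable pairs; only the `counts` dictionary changes.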


2. cont. EX-10 Degree, Fitness
Joint and Marginal Probabilities, P(Education, Fitness)

Education              1         2         3         4         5         P(Education)
college graduate       0.0000    0.0125    0.0250    0.0125    0.0000    0.0500
high school or lower   0.0000    0.0750    0.2500    0.0500    0.0000    0.3750
master’s degree        0.0125    0.0875    0.4000    0.0500    0.0250    0.5750
P(Fitness)             0.0125    0.1750    0.6750    0.1125    0.0250    1.0000

EX-10 Degree, Fitness
Conditional Probabilities, P(Fitness|Education)

Education              1         2         3         4         5
college graduate       0.0000    0.2500    0.5000    0.2500    0.0000
high school or lower   0.0000    0.2000    0.6667    0.1333    0.0000
master’s degree        0.0217    0.1522    0.6957    0.0870    0.0435

EX-10 Degree, Fitness
Conditional Probabilities, P(Education|Fitness)

Education              1         2         3         4         5
college graduate       0.0000    0.0714    0.0370    0.1111    0.0000
high school or lower   0.0000    0.4286    0.3704    0.4444    0.0000
master’s degree        1.0000    0.5000    0.5926    0.4444    1.0000


2. cont. EX-10 Relationship, Fitness
Joint and Marginal Probabilities, P(Relationship, Fitness)

Relationship        1         2         3         4         5         P(Relationship)
Partnered           0.0125    0.1375    0.3875    0.0500    0.0125    0.6000
Single              0.0000    0.0375    0.2875    0.0625    0.0125    0.4000
P(Fitness)          0.0125    0.1750    0.6750    0.1125    0.0250    1.0000

EX-10 Relationship, Fitness
Conditional Probabilities, P(Fitness|Relationship)

Relationship        1         2         3         4         5
Partnered           0.0208    0.2292    0.6458    0.0833    0.0208
Single              0.0000    0.0938    0.7188    0.1563    0.0313

EX-10 Relationship, Fitness
Conditional Probabilities, P(Relationship|Fitness)

Relationship        1         2         3         4         5
Partnered           1.0000    0.7857    0.5741    0.4444    0.5000
Single              0.0000    0.2143    0.4259    0.5556    0.5000



Chapter 6

1. For EX-10, EX-11, and AIX-12, age cannot be approximated by the normal distribution because age is defined as a categorical variable.

[Plots not reproduced: Income, Usage, and Miles for each of EX-10, EX-11, and AIX-12.]

2. EX-10: Age is defined as a categorical variable. Income appears approximately normally distributed. Usage is a discrete variable with only 4 different values and, hence, is not normally distributed. The number of elliptical miles appears to be right-skewed.

EX-11: Age is defined as a categorical variable. Income appears approximately normally distributed. Usage is a discrete variable with only 4 different values and, hence, is not normally distributed. The number of elliptical miles appears to be right-skewed.

AIX-12: Age is defined as a categorical variable. Income appears to depart slightly from the normal distribution. Usage is a discrete variable with only 5 different values and, hence, is not normally distributed. The number of elliptical miles appears to be right-skewed.


Chapter 8

1.

EX-10
Confidence Interval Estimate for the Mean: 95% confidence interval
Years: 2.94 ≤ μ ≤ 3.58
Income: 44,398.31 ≤ μ ≤ 48,437.74
Usage: 2.91 ≤ μ ≤ 3.26
Miles: 76.29 ≤ μ ≤ 89.13
Fitness: 2.81 ≤ μ ≤ 3.11

Confidence Interval Estimate for the Proportion: 95% confidence interval
Store (sample size = 80)
Ashland (12): 0.0718 ≤ π ≤ 0.2282
Galleria (4): 0.0022 ≤ π ≤ 0.0978
Oxford Glen (19): 0.1442 ≤ π ≤ 0.3308
Springville (13): 0.0817 ≤ π ≤ 0.2433
Web (32): 0.2926 ≤ π ≤ 0.5074
Age Group (sample size = 80)
18-34 (63): 0.6979 ≤ π ≤ 0.8771
35-55 (17): 0.1229 ≤ π ≤ 0.3021
Gender (sample size = 80)
Cis Man (3): –0.0041 ≤ π ≤ 0.0791
Man (19): 0.1442 ≤ π ≤ 0.3308
Non-binary (6): 0.0173 ≤ π ≤ 0.1327
Prefer not to say (7): 0.0256 ≤ π ≤ 0.1494
Trans (3): –0.0041 ≤ π ≤ 0.0791
Trans Woman (1): –0.0118 ≤ π ≤ 0.0368
Woman (24): 0.1996 ≤ π ≤ 0.4004
(blank) (17): 0.1229 ≤ π ≤ 0.3021
Highest Level of Education (sample size = 80)
high school or lower (30): 0.2689 ≤ π ≤ 0.4811
college graduate (4): 0.0022 ≤ π ≤ 0.0978
master’s degree (46): 0.4667 ≤ π ≤ 0.6833
Relationship Status (sample size = 80)
Single (32): 0.2926 ≤ π ≤ 0.5074
Partnered (48): 0.4926 ≤ π ≤ 0.7074
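The EX-10 intervals can be reproduced from the sample summaries. The following is a minimal sketch, assuming the usual normal-approximation interval for a proportion and a t-based interval for a mean; the function names are hypothetical, and the critical value 1.9905 for 79 degrees of freedom is an assumption taken from a t table rather than from the original output.

```python
import math


def proportion_ci(x, n, z=1.96):
    """95% interval for a proportion: p +/- z * sqrt(p(1 - p)/n)."""
    p = x / n
    half_width = z * math.sqrt(p * (1 - p) / n)
    return p - half_width, p + half_width


def mean_ci(mean, std_error, t_crit):
    """Interval for a mean: mean +/- t * (standard error of the mean)."""
    half_width = t_crit * std_error
    return mean - half_width, mean + half_width


# Store = Ashland: 12 of the 80 EX-10 purchasers.
ashland = proportion_ci(12, 80)
# Gender = Cis Man: 3 of 80; the lower limit falls below zero.
cis_man = proportion_ci(3, 80)
# Years as customer: mean and standard error from the Chapter 3 summary;
# t_crit = 1.9905 is the assumed critical value for 79 degrees of freedom.
years = mean_ci(mean=3.26375, std_error=0.1605, t_crit=1.9905)

print([round(v, 4) for v in ashland])
print([round(v, 4) for v in cis_man])
print([round(v, 2) for v in years])
```

A negative lower limit, as for Cis Man, is a sign that the count is too small for the normal approximation to work well; a proportion cannot be below 0.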



EX-11

Confidence Interval Estimate for the Mean: 95% confidence interval

Variable    Lower limit    Upper limit
Years       2.75           3.59
Income      46,738.09      51,209.21
Usage       2.86           3.27
Miles       79.39          96.57
Fitness     2.74           3.06

Confidence Interval Estimate for the Proportion: 95% confidence interval

Category (count)            Lower limit    Upper limit

Store (sample size = 60)
Ashland (12)                0.0988         0.3012
Galleria (5)                0.0134         0.1533
Oxford Glen (21)            0.2293         0.4707
Springville (7)             0.0354         0.1979
Web (15)                    0.1404         0.3596

Age Group (sample size = 60)
18-34 (48)                  0.6988         0.9012
35-55 (12)                  0.0988         0.3012

Gender (sample size = 60)
Cis Man (3)                 –0.0051        0.1051
Man (14)                    0.1263         0.3404
Non-binary (6)              0.0241         0.1759
Prefer not to say (5)       0.0134         0.1533
Trans (3)                   –0.0051        0.1051
Woman (21)                  0.2293         0.4707
(blank) (8)                 0.0473         0.2193

Highest Level of Education (sample size = 60)
high school or lower (23)   0.2603         0.5064
college graduate (1)        –0.0157        0.0491
master’s degree (36)        0.4760         0.7240

Relationship Status (sample size = 60)
Single (24)                 0.2760         0.5240
Partnered (36)              0.4760         0.7240



AIX-12

Confidence Interval Estimate for the Mean: 95% confidence interval

Variable    Lower limit    Upper limit
Years       3.01           4.12
Income      69,523.12      81,360.03
Usage       4.47           5.08
Miles       147.69         186.11
Fitness     4.41           4.84

Confidence Interval Estimate for the Proportion: 95% confidence interval

Category (count)            Lower limit    Upper limit

Store (sample size = 40)
Ashland (8)                 0.0760         0.3240
Galleria (3)                –0.0066        0.1566
Oxford Glen (10)            0.1158         0.3842
Springville (5)             0.0225         0.2275
Web (14)                    0.2022         0.4978

Age Group (sample size = 40)
18-34 (33)                  0.7072         0.9428
35-55 (7)                   0.0572         0.2928

Gender (sample size = 40)
Cis Man (1)                 –0.0234        0.0734
Cis Woman (1)               –0.0234        0.0734
Man (17)                    0.2718         0.5782
Non-binary (3)              –0.0066        0.1566
Prefer not to say (5)       0.0225         0.2275
Trans (6)                   0.0393         0.2607
Woman (4)                   0.0070         0.1930
(blank) (3)                 –0.0066        0.1566

Highest Level of Education (sample size = 40)
high school or lower (2)    –0.0175        0.1175
doctorate (4)               0.0070         0.1930
master’s degree (34)        0.7393         0.9607

Relationship Status (sample size = 40)
Single (17)                 0.2718         0.5782
Partnered (23)              0.4218         0.7282



2.

EX-10 You are 95% confident that the mean years as customer is between 2.94 and 3.58. You are 95% confident that the mean annual household income of the customers is between $44,398.31 and $48,437.74. You are 95% confident that the mean average number of times the customer plans to use the elliptical each week is between 2.91 and 3.26. You are 95% confident that the mean average number of miles the customer expects to use the elliptical each week is between 76.29 and 89.13. You are 95% confident that the mean level of fitness is between 2.81 and 3.11. You are 95% confident that the proportion of the store: Ashland is between 0.0718 and 0.2282, Galleria is between 0.0022 and 0.0978, Oxford Glen is between 0.1442 and 0.3308, Springville is between 0.0817 and 0.2433, and Web is between 0.2926 and 0.5074. You are 95% confident that the proportion of the age group: 18-34 is between 0.6979 and 0.8771, and 35-55 is between 0.1229 and 0.3021. You are 95% confident that the proportion of the gender: Cis Man is between –0.0041 and 0.0791, Man is between 0.1442 and 0.3308, Non-binary is between 0.0173 and 0.1327, Prefer not to say is between 0.0256 and 0.1494, Trans is between –0.0041 and 0.0791, Trans Woman is between –0.0118 and 0.0368, Woman is between 0.1996 and 0.4004, and (blank) is between 0.1229 and 0.3021. You are 95% confident that the proportion of highest education: high school or lower is between 0.2689 and 0.4811, college graduate is between 0.0022 and 0.0978, and master’s degree is between 0.4667 and 0.6833. You are 95% confident that the proportion of the relationship status: single is between 0.2926 and 0.5074, and partnered is between 0.4926 and 0.7074.

EX-11 You are 95% confident that the mean years as customer is between 2.75 and 3.59. You are 95% confident that the mean annual household income of the customers is between $46,738.09 and $51,209.21. You are 95% confident that the mean average number of times the customer plans to use the elliptical each week is between 2.86 and 3.27. You are 95% confident that the mean average number of miles the customer expects to use the elliptical each week is between 79.39 and 96.57. You are 95% confident that the mean level of fitness is between 2.74 and 3.06. You are 95% confident that the proportion of the store: Ashland is between 0.0988 and 0.3012, Galleria is between 0.0134 and 0.1533, Oxford Glen is between 0.2293 and 0.4707, Springville is between 0.0354 and 0.1979, and Web is between 0.1404 and 0.3596. You are 95% confident that the proportion of the age group: 18-34 is between 0.6988 and 0.9012, and 35-55 is between 0.0988 and 0.3012. You are 95% confident that the proportion of the gender: Cis Man is between –0.0051 and 0.1051, Man is between 0.1263 and 0.3404, Non-binary is between 0.0241 and 0.1759, Prefer not to say is between 0.0134 and 0.1533, Trans is between –0.0051 and 0.1051, Woman is between 0.2293 and 0.4707, and (blank) is between 0.0473 and 0.2193. You are 95% confident that the proportion of highest education: high school or lower is between 0.2603 and 0.5064, college graduate is between –0.0157 and 0.0491, and master’s degree is between 0.4760 and 0.7240. You are 95% confident that the proportion of the relationship status: single is between 0.2760 and 0.5240, and partnered is between 0.4760 and 0.7240.



AIX-12 You are 95% confident that the mean years as customer is between 3.01 and 4.12. You are 95% confident that the mean annual household income of the customers is between $69,523.12 and $81,360.03. You are 95% confident that the mean average number of times the customer plans to use the elliptical each week is between 4.47 and 5.08. You are 95% confident that the mean average number of miles the customer expects to use the elliptical each week is between 147.69 and 186.11. You are 95% confident that the mean level of fitness is between 4.41 and 4.84. You are 95% confident that the proportion of the store: Ashland is between 0.0760 and 0.3240, Galleria is between –0.0066 and 0.1566, Oxford Glen is between 0.1158 and 0.3842, Springville is between 0.0225 and 0.2275, and Web is between 0.2022 and 0.4978. You are 95% confident that the proportion of the age group: 18-34 is between 0.7072 and 0.9428, and 35-55 is between 0.0572 and 0.2928. You are 95% confident that the proportion of the gender: Cis Man is between –0.0234 and 0.0734, Cis Woman is between –0.0234 and 0.0734, Man is between 0.2718 and 0.5782, Non-binary is between –0.0066 and 0.1566, Prefer not to say is between 0.0225 and 0.2275, Trans is between 0.0393 and 0.2607, Woman is between 0.0070 and 0.1930, and (blank) is between –0.0066 and 0.1566. You are 95% confident that the proportion of highest education: high school or lower is between –0.0175 and 0.1175, doctorate is between 0.0070 and 0.1930, and master’s degree is between 0.7393 and 0.9607. You are 95% confident that the proportion of the relationship status: single is between 0.2718 and 0.5782, and partnered is between 0.4218 and 0.7282.
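The proportion intervals in this chapter appear consistent with the usual normal-approximation formula, p ± Z·sqrt(p(1 − p)/n). The following is a minimal sketch under that assumption; the function name `proportion_ci` is illustrative and is not part of the original solution or of PHStat.

```python
import math
from statistics import NormalDist


def proportion_ci(count, n, confidence=0.95):
    """Normal-approximation confidence interval for a proportion (assumed formula)."""
    p = count / n
    # Two-sided critical value of the standard normal distribution.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * math.sqrt(p * (1 - p) / n)
    return p - margin, p + margin


# Ashland store among the 80 EX-10 customers (count taken from the table above).
lower, upper = proportion_ci(12, 80)
print(f"{lower:.4f} {upper:.4f}")
```

For very small counts, such as Cis Man (3 of 80), the same formula produces a lower limit below zero, which is where the negative limits in the reported output come from.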


Chapter 10

1.

Years as a Customer:

Population 1 = graduate degree (120), 2 = not (60)
H0: $\sigma_1^2 = \sigma_2^2$; H1: $\sigma_1^2 \neq \sigma_2^2$

F Test for Differences in Two Variances

Data
Level of Significance                     0.05
Larger-Variance Sample
  Sample Size                             120
  Sample Variance                         2.735394958
Smaller-Variance Sample
  Sample Size                             60
  Sample Variance                         1.459389831

Intermediate Calculations
F Test Statistic                          1.8743
Population 1 Sample Degrees of Freedom    119
Population 2 Sample Degrees of Freedom    59

Two-Tail Test
Upper Critical Value                      1.5867
p-Value                                   0.0082
Reject the null hypothesis

Decision rule: If FSTAT > 1.5867, reject H0.
Test statistic: $F_{STAT} = S_1^2 / S_2^2 = 1.8743$
Decision: Since FSTAT = 1.8743 > 1.5867 and the p-value < 0.05, reject H0. There is enough evidence to conclude that the two population variances are different. Hence, the appropriate test for the difference in two means is the separate-variance t test.
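The F statistic above is the ratio of the larger sample variance to the smaller one. As a rough sketch of that arithmetic (the helper name `f_test_statistic` is hypothetical, and the critical value is copied from the output above, not computed):

```python
def f_test_statistic(var_a, var_b):
    """Ratio of the larger sample variance to the smaller sample variance."""
    larger, smaller = max(var_a, var_b), min(var_a, var_b)
    return larger / smaller


# Sample variances reported above for years as a customer.
f_stat = f_test_statistic(2.735394958, 1.459389831)

# Upper critical value taken from the reported output (df = 119 and 59).
F_UPPER = 1.5867
reject_h0 = f_stat > F_UPPER
print(f"FSTAT = {f_stat:.4f}, reject H0: {reject_h0}")
```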



Population 1 = graduate degree (120), 2 = not (60)
H0: $\mu_1 = \mu_2$; H1: $\mu_1 \neq \mu_2$

Separate-Variances t Test for the Difference Between Two Means
(assumes unequal population variances)

Data
Hypothesized Difference                   0
Level of Significance                     0.05
Population 1 Sample
  Sample Size                             120
  Sample Mean                             3.58
  Sample Standard Deviation               1.6539
Population 2 Sample
  Sample Size                             60
  Sample Mean                             2.74
  Sample Standard Deviation               1.2081

Intermediate Calculations
Numerator of Degrees of Freedom           0.0022
Denominator of Degrees of Freedom         0.0000
Total Degrees of Freedom                  154.2405
Degrees of Freedom                        154
Standard Error                            0.2171
Difference in Sample Means                0.8400
Separate-Variance t Test Statistic        3.8698

Two-Tail Test
Lower Critical Value                      -1.9755
Upper Critical Value                      1.9755
p-Value                                   0.0002
Reject the null hypothesis

Decision: Since the p-value < 0.05, reject H0. There is sufficient evidence to conclude that the mean years as a customer is different between graduate degree holding customers and customers without a graduate degree.
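The intermediate calculations above (standard error, degrees of freedom, and test statistic) can be reproduced from the summary statistics. This is a sketch only: the function name is illustrative, and it uses the sample variances from the preceding F test output in place of the rounded standard deviations.

```python
import math


def separate_variance_t(mean1, var1, n1, mean2, var2, n2):
    """Separate-variance (unequal variances) t statistic from summary statistics."""
    v1, v2 = var1 / n1, var2 / n2
    se = math.sqrt(v1 + v2)
    t_stat = (mean1 - mean2) / se
    # Satterthwaite approximation, rounded down as in the reported output.
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t_stat, se, math.floor(df)


# Years as a customer: graduate degree (n = 120) versus not (n = 60).
t_stat, se, df = separate_variance_t(3.58, 2.735394958, 120, 2.74, 1.459389831, 60)
print(f"t = {t_stat:.4f}, SE = {se:.4f}, df = {df}")
```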



Annual Household Income:

Population 1 = graduate degree (120), 2 = not (60)
H0: $\sigma_1^2 = \sigma_2^2$; H1: $\sigma_1^2 \neq \sigma_2^2$

F Test for Differences in Two Variances

Data
Level of Significance                     0.05
Larger-Variance Sample
  Sample Size                             120
  Sample Variance                         299586226.1
Smaller-Variance Sample
  Sample Size                             60
  Sample Variance                         93303111.51

Intermediate Calculations
F Test Statistic                          3.2109
Population 1 Sample Degrees of Freedom    119
Population 2 Sample Degrees of Freedom    59

Two-Tail Test
Upper Critical Value                      1.5867
p-Value                                   0.0000
Reject the null hypothesis

Decision rule: If FSTAT > 1.5867, reject H0.
Test statistic: $F_{STAT} = S_1^2 / S_2^2 = 3.2109$
Decision: Since FSTAT = 3.2109 > 1.5867 and the p-value < 0.05, reject H0. There is enough evidence to conclude that the two population variances are different. Hence, the appropriate test for the difference in two means is the separate-variance t test.



Population 1 = graduate degree (120), 2 = not (60)
H0: $\mu_1 = \mu_2$; H1: $\mu_1 \neq \mu_2$

Separate-Variances t Test for the Difference Between Two Means
(assumes unequal population variances)

Data
Hypothesized Difference                   0
Level of Significance                     0.05
Population 1 Sample
  Sample Size                             120
  Sample Mean                             58319.275
  Sample Standard Deviation               17308.5593
Population 2 Sample
  Sample Size                             60
  Sample Mean                             44520.18333
  Sample Standard Deviation               9659.3536

Intermediate Calculations
Numerator of Degrees of Freedom           16415492889630.4000
Denominator of Degrees of Freedom         93362437688.9342
Total Degrees of Freedom                  175.8255
Degrees of Freedom                        175
Standard Error                            2012.8596
Difference in Sample Means                13799.0917
Separate-Variance t Test Statistic        6.8555

Two-Tail Test
Lower Critical Value                      -1.9736
Upper Critical Value                      1.9736
p-Value                                   0.0000
Reject the null hypothesis

Decision: Since the p-value < 0.05, reject H0. There is sufficient evidence to conclude that the mean annual household income is different between graduate degree holding customers and customers without a graduate degree.



Weekly Usage:

Population 1 = graduate degree (120), 2 = not (60)
H0: $\sigma_1^2 = \sigma_2^2$; H1: $\sigma_1^2 \neq \sigma_2^2$

F Test for Differences in Two Variances

Data
Level of Significance                     0.05
Larger-Variance Sample
  Sample Size                             120
  Sample Variance                         1.162394958
Smaller-Variance Sample
  Sample Size                             60
  Sample Variance                         0.931920904

Intermediate Calculations
F Test Statistic                          1.2473
Population 1 Sample Degrees of Freedom    119
Population 2 Sample Degrees of Freedom    59

Two-Tail Test
Upper Critical Value                      1.5867
p-Value                                   0.3471
Do not reject the null hypothesis

Decision rule: If FSTAT > 1.5867, reject H0.
Test statistic: $F_{STAT} = S_1^2 / S_2^2 = 1.2473$
Decision: Since FSTAT = 1.2473 < 1.5867 and the p-value = 0.3471 > 0.05, do not reject H0. There is not enough evidence of a difference in the two population variances. Hence, a pooled-variance t test for the difference in two population means can be used.



Population 1 = graduate degree (120), 2 = not (60)
H0: $\mu_1 = \mu_2$; H1: $\mu_1 \neq \mu_2$

Pooled-Variance t Test for the Difference Between Two Means
(assumes equal population variances)

Data
Hypothesized Difference                   0
Level of Significance                     0.05
Population 1 Sample
  Sample Size                             120
  Sample Mean                             3.675
  Sample Standard Deviation               1.078144219
Population 2 Sample
  Sample Size                             60
  Sample Mean                             3.016666667
  Sample Standard Deviation               0.965360505

Intermediate Calculations
Population 1 Sample Degrees of Freedom    119
Population 2 Sample Degrees of Freedom    59
Total Degrees of Freedom                  178
Pooled Variance                           1.0860
Standard Error                            0.1648
Difference in Sample Means                0.6583
t Test Statistic                          3.9954

Two-Tail Test
Lower Critical Value                      -1.9734
Upper Critical Value                      1.9734
p-Value                                   0.0001
Reject the null hypothesis

Decision: Since the p-value < 0.05, reject H0. There is sufficient evidence to conclude that the mean weekly usage is different between graduate degree holding customers and customers without a graduate degree.
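The pooled variance, standard error, and t statistic reported above follow from the two sample sizes, means, and standard deviations. A minimal sketch of that calculation, with an illustrative function name that does not come from the original solution:

```python
import math


def pooled_variance_t(mean1, sd1, n1, mean2, sd2, n2):
    """Pooled-variance t statistic from summary statistics."""
    df = n1 + n2 - 2
    # Weighted average of the two sample variances.
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / se, pooled_var, se, df


# Weekly usage: graduate degree (n = 120) versus not (n = 60).
t_stat, pooled_var, se, df = pooled_variance_t(
    3.675, 1.078144219, 120, 3.016666667, 0.965360505, 60
)
print(f"t = {t_stat:.4f}, pooled variance = {pooled_var:.4f}, SE = {se:.4f}, df = {df}")
```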



Elliptical Miles:

Population 1 = graduate degree (120), 2 = not (60)
H0: $\sigma_1^2 = \sigma_2^2$; H1: $\sigma_1^2 \neq \sigma_2^2$

F Test for Differences in Two Variances

Data
Level of Significance                     0.05
Larger-Variance Sample
  Sample Size                             120
  Sample Variance                         3005.576513
Smaller-Variance Sample
  Sample Size                             60
  Sample Variance                         1825.137751

Intermediate Calculations
F Test Statistic                          1.6468
Population 1 Sample Degrees of Freedom    119
Population 2 Sample Degrees of Freedom    59

Two-Tail Test
Upper Critical Value                      1.5867
p-Value                                   0.0345
Reject the null hypothesis

Decision rule: If FSTAT > 1.5867, reject H0.
Test statistic: $F_{STAT} = S_1^2 / S_2^2 = 1.6468$
Decision: Since FSTAT = 1.6468 > 1.5867 and the p-value < 0.05, reject H0. There is enough evidence to conclude that the two population variances are different. Hence, the appropriate test for the difference in two means is the separate-variance t test.



Population 1 = graduate degree (120), 2 = not (60)
H0: $\mu_1 = \mu_2$; H1: $\mu_1 \neq \mu_2$

Separate-Variances t Test for the Difference Between Two Means
(assumes unequal population variances)

Data
Hypothesized Difference                   0
Level of Significance                     0.05
Population 1 Sample
  Sample Size                             120
  Sample Mean                             109.875
  Sample Standard Deviation               54.8231
Population 2 Sample
  Sample Size                             60
  Sample Mean                             89.77666667
  Sample Standard Deviation               42.7216

Intermediate Calculations
Numerator of Degrees of Freedom           3076.4143
Denominator of Degrees of Freedom         20.9549
Total Degrees of Freedom                  146.8111
Degrees of Freedom                        146
Standard Error                            7.4475
Difference in Sample Means                20.0983
Separate-Variance t Test Statistic        2.6987

Two-Tail Test
Lower Critical Value                      -1.9763
Upper Critical Value                      1.9763
p-Value                                   0.0078
Reject the null hypothesis

Decision: Since the p-value < 0.05, reject H0. There is sufficient evidence to conclude that the mean elliptical miles is different between graduate degree holding customers and customers without a graduate degree.

2.

At the 5% level of significance, there is enough evidence of a difference between graduate degree holding customers and customers without a graduate degree in their mean years as a customer, annual household income, weekly usage, and elliptical miles. In each of the four t tests in part (1), the p-value is less than 0.05 and H0 is rejected.


Chapter 11

1.

Years as a Customer, based on the elliptical purchased

H0: $\sigma_1^2 = \sigma_2^2 = \sigma_3^2$; H1: At least one variance is different.

Levene’s test for equality of variances:

SUMMARY
Groups    Count   Sum     Average        Variance
EX-10     80      96.9    1.21125        0.6443
EX-11     60      76.6    1.276666667    1.0345
AIX-12    40      52.9    1.3225         1.3154

ANOVA
Source of Variation   SS          df    MS       F        P-value   F crit
Between Groups        0.3622      2     0.1811   0.1963   0.8219    3.0470
Within Groups         163.2370    177   0.9222
Total                 163.5991    179

Level of significance   0.05

Since p-value = 0.8219 > 0.05, do not reject H0. There is not enough evidence that the equal variance assumption needed for the one-way ANOVA F test for the difference in the means is violated. Hence, the one-way ANOVA F test for the difference in the means can be used.

H0: $\mu_1 = \mu_2 = \mu_3$; H1: At least one mean is different.
Decision rule: df: 2, 177. If F > 3.047, reject H0.

One-way ANOVA F test for difference in the means:

ANOVA: Single Factor

SUMMARY
Groups    Count   Sum      Average    Variance
EX-10     80      261.1    3.26375    2.0596
EX-11     60      190.2    3.17       2.6428
AIX-12    40      142.7    3.5675     3.0058

ANOVA
Source of Variation   SS          df    MS       F        P-value   F crit
Between Groups        3.9814      2     1.9907   0.8084   0.4472    3.0470
Within Groups         435.8586    177   2.4625
Total                 439.8400    179

Level of significance   0.05

Test statistic: FSTAT = 0.8084
Since p-value = 0.4472 > 0.05, and FSTAT = 0.8084 < 3.047, do not reject H0. There is insufficient evidence to conclude that the mean years as a customer across the product purchased (EX-10, EX-11, and AIX-12) are different.
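The one-way ANOVA F statistic above can be recovered from the SUMMARY table alone (group counts, averages, and variances). This is a sketch under that assumption; the function name is illustrative, and small differences from the reported sums of squares come from the rounded variances.

```python
def one_way_anova(groups):
    """F statistic from (count, mean, variance) tuples, one tuple per group."""
    n_total = sum(n for n, _, _ in groups)
    grand_mean = sum(n * mean for n, mean, _ in groups) / n_total
    # Among-group and within-group sums of squares.
    ss_among = sum(n * (mean - grand_mean) ** 2 for n, mean, _ in groups)
    ss_within = sum((n - 1) * var for n, _, var in groups)
    df_among = len(groups) - 1
    df_within = n_total - len(groups)
    f_stat = (ss_among / df_among) / (ss_within / df_within)
    return f_stat, df_among, df_within


# Years as a customer for EX-10, EX-11, and AIX-12 (values from the SUMMARY table).
years = [(80, 3.26375, 2.0596), (60, 3.17, 2.6428), (40, 3.5675, 3.0058)]
f_stat, df_among, df_within = one_way_anova(years)
print(f"F = {f_stat:.4f} with df = ({df_among}, {df_within})")
```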



Annual Household Income ($), based on the elliptical purchased

H0: $\sigma_1^2 = \sigma_2^2 = \sigma_3^2$; H1: At least one variance is different.

Levene’s test for equality of variances:

SUMMARY
Groups    Count   Sum       Average      Variance
EX-10     80      598062    7475.775     25815287.7968
EX-11     60      405183    6753.05      28754955.3025
AIX-12    40      659391    16484.775    65052816.4609

ANOVA
Source of Variation   SS                 df    MS                 F         P-value   F crit
Between Groups        2719563231.0250    2     1359781615.5125    38.3678   0.0000    3.0470
Within Groups         6273009940.7750    177   35440734.1287
Total                 8992573171.8000    179

Level of significance   0.05

Since p-value = 0.0000 < 0.05, reject H0. There is enough evidence to conclude that the variances in annual household income across the products (EX-10, EX-11, AIX-12) are different.

H0: $\mu_1 = \mu_2 = \mu_3$; H1: At least one mean is different.
Decision rule: df: 2, 177. If F > 3.047, reject H0.

One-way ANOVA F test for difference in the means:

ANOVA: Single Factor

SUMMARY
Groups    Count   Sum        Average      Variance
EX-10     80      3713442    46418.025    82369840.5057
EX-11     60      2938419    48973.65     74891532.3331
AIX-12    40      3017663    75441.575    342465992.7122

ANOVA
Source of Variation   SS                  df    MS                  F         P-value   F crit
Between Groups        24490250198.5361    2     12245125099.2681    89.2590   0.0000    3.0470
Within Groups         24281991523.3750    177   137186392.7874
Total                 48772241721.9111    179

Level of significance   0.05

Test statistic: FSTAT = 89.2590
Since p-value = 0.0000 < 0.05, and FSTAT = 89.2590 > 3.047, reject H0. There is enough evidence to conclude that there is a significant difference in mean annual household income across the product purchased (EX-10, EX-11, AIX-12).

From the Tukey Pairwise Comparison procedure, there is a difference in mean annual household income between the customers of EX-10 and AIX-12, and EX-11 and AIX-12.



Usage (mean number of times the customer plans to use the elliptical each week), based on the elliptical purchased

H0: $\sigma_1^2 = \sigma_2^2 = \sigma_3^2$; H1: At least one variance is different.

Levene’s test for equality of variances:

SUMMARY
Groups    Count   Sum   Average        Variance
EX-10     80      45    0.5625         0.2998
EX-11     60      32    0.533333333    0.3548
AIX-12    40      31    0.775          0.3327

ANOVA
Source of Variation   SS         df    MS       F        P-value   F crit
Between Groups        1.6042     2     0.8021   2.4649   0.0879    3.0470
Within Groups         57.5958    177   0.3254
Total                 59.2000    179

Level of significance   0.05

Since p-value = 0.0879 > 0.05, do not reject H0. There is not enough evidence that the equal variance assumption needed for the one-way ANOVA F test for the difference in the means is violated. Hence, the one-way ANOVA F test for the difference in the means can be used.

H0: $\mu_1 = \mu_2 = \mu_3$; H1: At least one mean is different.
Decision rule: df: 2, 177. If F > 3.047, reject H0.

One-way ANOVA F test for difference in the means:

ANOVA: Single Factor

SUMMARY
Groups    Count   Sum    Average        Variance
EX-10     80      247    3.0875         0.6125
EX-11     60      184    3.066666667    0.6395
AIX-12    40      191    4.775          0.8968

ANOVA
Source of Variation   SS          df    MS        F         P-value   F crit
Between Groups        89.5486     2     44.7743   65.4445   0.0000    3.0470
Within Groups         121.0958    177   0.6842
Total                 210.6444    179

Level of significance   0.05

Test statistic: FSTAT = 65.4445
Since p-value = 0.0000 < 0.05, and FSTAT = 65.4445 > 3.047, reject H0. There is enough evidence to conclude that there is a significant difference in usage (mean number of times the customer plans to use the elliptical each week) across the product purchased (EX-10, EX-11, AIX-12).


From the Tukey Pairwise Comparison procedure, there is a difference in usage (mean number of times the customer plans to use the elliptical each week) between the customers of EX-10 and AIX-12, and EX-11 and AIX-12.



Miles (mean number of elliptical miles the customer expects to exercise each week), based on the elliptical purchased

H0: $\sigma_1^2 = \sigma_2^2 = \sigma_3^2$; H1: At least one variance is different.

Levene’s test for equality of variances:

SUMMARY
Groups    Count   Sum       Average        Variance
EX-10     80      1710      21.375         373.3867
EX-11     60      1420.4    23.67333333    546.0569
AIX-12    40      1744      43.6           1707.1179

ANOVA
Source of Variation   SS             df    MS          F        P-value   F crit
Between Groups        14216.5007     2     7108.2503   9.8070   0.0001    3.0470
Within Groups         128292.5073    177   724.8164
Total                 142509.0080    179

Level of significance   0.05

Since p-value = 0.0001 < 0.05, reject H0. There is enough evidence to conclude that the variances in miles across the products (EX-10, EX-11, AIX-12) are different.

H0: $\mu_1 = \mu_2 = \mu_3$; H1: At least one mean is different.
Decision rule: df: 2, 177. If F > 3.047, reject H0.

One-way ANOVA F test for difference in the means:

ANOVA: Single Factor

SUMMARY
Groups    Count   Sum       Average   Variance
EX-10     80      6616.8    82.71     832.4434
EX-11     60      5278.8    87.98     1105.6986
AIX-12    40      6676      166.9     3607.9897

ANOVA
Source of Variation   SS             df    MS            F         P-value   F crit
Between Groups        209793.6044    2     104896.8022   68.3327   0.0000    3.0470
Within Groups         271710.8480    177   1535.0895
Total                 481504.4524    179

Level of significance   0.05

Test statistic: FSTAT = 68.3327
Since p-value = 0.0000 < 0.05, and FSTAT = 68.3327 > 3.047, reject H0. There is enough evidence to conclude that there is a significant difference in miles (mean number of elliptical miles the customer expects to exercise each week) across the product purchased (EX-10, EX-11, AIX-12).



From the Tukey Pairwise Comparison procedure, there is a difference in miles (mean number of elliptical miles the customer expects to exercise each week) between the customers of EX-10 and AIX-12, and EX-11 and AIX-12.

2.

There is not enough evidence of any difference between customers based on the product purchased (EX-10, EX-11, AIX-12) in the number of years as a customer. There is enough evidence of a difference between customers based on the product purchased (EX-10, EX-11, AIX-12) in annual household income. There is enough evidence of a difference between customers based on the product purchased (EX-10, EX-11, AIX-12) in usage (mean number of times the customer plans to use the elliptical each week). There is enough evidence of a difference between customers based on the product purchased (EX-10, EX-11, AIX-12) in miles (mean number of elliptical miles the customer expects to exercise each week).



Chapter 12

1.

Years as a customer:

Population 1 = EX-10, 2 = EX-11, 3 = AIX-12
H0: $M_1 = M_2 = M_3$; H1: Not all $M_j$ are the same.

Kruskal-Wallis Rank Test for Differences in Medians

Data
Level of Significance                0.05

Intermediate Calculations
Sum of Squared Ranks/Sample Size     1478330
Sum of Sample Sizes                  180
Number of Groups                     3

Group     Sample Size   Sum of Ranks   Mean Ranks
EX-10     80            7301           91.2625
EX-11     60            5083.5         84.725
AIX-12    40            3905.5         97.6375

Test Result
H Test Statistic                     1.5047
Critical Value                       5.9915
p-Value                              0.4713
Do not reject the null hypothesis

Because H = 1.5047 < 5.9915 or p-value = 0.4713 > 0.05, do not reject H0. At the 0.05 significance level, there is not enough evidence of any difference between customers based on the product purchased (EX-10, EX-11, AIX-12) in median years as a customer.
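The H statistic above can be reproduced from the rank sums and sample sizes in the group table using the standard Kruskal-Wallis formula without a tie correction. The following is a sketch under that assumption; the function name is illustrative, and the critical value is copied from the output rather than computed.

```python
def kruskal_wallis_h(rank_sums, sizes):
    """Kruskal-Wallis H from per-group rank sums and sample sizes (no tie correction)."""
    n = sum(sizes)
    # Sum over groups of (squared rank sum / sample size).
    sum_sq = sum(t ** 2 / n_j for t, n_j in zip(rank_sums, sizes))
    return 12 / (n * (n + 1)) * sum_sq - 3 * (n + 1)


# Years as a customer: rank sums for EX-10, EX-11, and AIX-12.
h_stat = kruskal_wallis_h([7301, 5083.5, 3905.5], [80, 60, 40])

# Chi-square critical value with 2 degrees of freedom, taken from the output above.
CRITICAL_VALUE = 5.9915
print(f"H = {h_stat:.4f}, reject H0: {h_stat > CRITICAL_VALUE}")
```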



Annual household income ($):

Population 1 = EX-10, 2 = EX-11, 3 = AIX-12
H0: $M_1 = M_2 = M_3$; H1: Not all $M_j$ are the same.

Kruskal-Wallis Rank Test for Differences in Medians

Data
Level of Significance                0.05

Intermediate Calculations
Sum of Squared Ranks/Sample Size     1640847
Sum of Sample Sizes                  180
Number of Groups                     3

Group     Sample Size   Sum of Ranks   Mean Ranks
EX-10     80            5571.5         69.64375
EX-11     60            4851.5         80.8583333
AIX-12    40            5867           146.675

Test Result
H Test Statistic                     61.3634
Critical Value                       5.9915
p-Value                              0.0000
Reject the null hypothesis

Because H = 61.3634 > 5.9915 or p-value = 0.0000 < 0.05, reject H0. There is enough evidence of a difference between customers based on the product purchased (EX-10, EX-11, AIX-12) in their median annual household income.



Number of times the customer plans to use the elliptical each week:

Population 1 = EX-10, 2 = EX-11, 3 = AIX-12
H0: $M_1 = M_2 = M_3$; H1: Not all $M_j$ are the same.

Kruskal-Wallis Rank Test for Differences in Medians

Data
Level of Significance                0.05

Intermediate Calculations
Sum of Squared Ranks/Sample Size     1644559
Sum of Sample Sizes                  180
Number of Groups                     3

Group     Sample Size   Sum of Ranks   Mean Ranks
EX-10     80            5992           74.9
EX-11     60            4377           72.95
AIX-12    40            5921           148.025

Test Result
H Test Statistic                     62.7307
Critical Value                       5.9915
p-Value                              0.0000
Reject the null hypothesis

Because H = 62.7307 > 5.9915 or p-value = 0.0000 < 0.05, reject H0. There is enough evidence of a difference between customers based on the product purchased (EX-10, EX-11, AIX-12) in the median number of times the customer plans to use the elliptical each week.



Number of elliptical miles:

Population 1 = EX-10, 2 = EX-11, 3 = AIX-12
H0: $M_1 = M_2 = M_3$; H1: Not all $M_j$ are the same.

Kruskal-Wallis Rank Test for Differences in Medians

Data
Level of Significance                0.05

Intermediate Calculations
Sum of Squared Ranks/Sample Size     1654611
Sum of Sample Sizes                  180
Number of Groups                     3

Group     Sample Size   Sum of Ranks   Mean Ranks
EX-10     80            5516           68.95
EX-11     60            4814           80.2333333
AIX-12    40            5960           149

Test Result
H Test Statistic                     66.4333
Critical Value                       5.9915
p-Value                              0.0000
Reject the null hypothesis

Because H = 66.4333 > 5.9915 or p-value = 0.0000 < 0.05, reject H0. There is enough evidence of a difference between customers based on the product purchased (EX-10, EX-11, AIX-12) in the median number of elliptical miles.



2.

Risk based on market cap:

H0: There is no relationship between risk and market cap
H1: There is a relationship between risk and market cap

PHStat output of the chi-square test:

Chi-Square Test

Observed Frequencies
                     Risk Level
Market Cap    Low    Average    High    Total
Mid-Cap       60     111        63      234
Small         53     79         120     252
Large         195    286        140     621
Total         308    476        323     1107

Expected Frequencies
                     Risk Level
Market Cap    Low         Average     High        Total
Mid-Cap       65.10569    100.6179    68.27642    234
Small         70.11382    108.3577    73.52846    252
Large         172.7805    267.0244    181.1951    621
Total         308         476         323         1107

Data
Level of Significance        0.05
Number of Rows               3
Number of Columns            3
Degrees of Freedom           4

Results
Critical Value               9.487729
Chi-Square Test Statistic    56.95336
p-Value                      1.27E-11
Reject the null hypothesis

Expected frequency assumption is met.

Since p-value = 0.0000 < 0.05, reject H0. There is enough evidence that risk is related to market cap and, hence, a difference in risk based on market cap.



Relationship Status (single or partnered):

H0: There is no relationship between relationship status and product purchased
H1: There is a relationship between relationship status and product purchased

Chi-Square Test

Observed Frequencies
                             Product Purchased
Relationship Status    EX-11    EX-10    AIX-12    Total
Partnered              36       48       23        107
Single                 24       32       17        73
Total                  60       80       40        180

Expected Frequencies
                             Product Purchased
Relationship Status    EX-11       EX-10       AIX-12      Total
Partnered              35.66667    47.55556    23.77778    107
Single                 24.33333    32.44444    16.22222    73
Total                  60          80          40          180

Data
Level of Significance        0.05
Number of Rows               2
Number of Columns            3
Degrees of Freedom           2

Results
Critical Value               5.991465
Chi-Square Test Statistic    0.080655
p-Value                      0.960475
Do not reject the null hypothesis

Expected frequency assumption is met.

Since the p-value = 0.9605 > 0.05, do not reject H0. There is not enough evidence of any difference in the relationship status (single or partnered) based on the product purchased (EX-10, EX-11, AIX-12).
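The expected frequencies and the chi-square statistic above follow directly from the observed table: each expected count is (row total × column total) / n. A minimal sketch of that computation follows; the function name is illustrative, and it also returns the smallest expected frequency so the expected frequency assumption can be checked by eye.

```python
def chi_square_statistic(observed):
    """Chi-square statistic, degrees of freedom, and smallest expected frequency."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    chi2 = 0.0
    min_expected = float("inf")
    for i, row in enumerate(observed):
        for j, obs in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (obs - expected) ** 2 / expected
            min_expected = min(min_expected, expected)
    df = (len(observed) - 1) * (len(col_totals) - 1)
    return chi2, df, min_expected


# Relationship status (rows: Partnered, Single) by product (EX-11, EX-10, AIX-12).
observed = [[36, 48, 23], [24, 32, 17]]
chi2, df, min_expected = chi_square_statistic(observed)
print(f"chi-square = {chi2:.6f}, df = {df}, smallest expected = {min_expected:.4f}")
```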



Fitness:

H0: There is no relationship between fitness and product purchased
H1: There is a relationship between fitness and product purchased

Chi-Square Test

Observed Frequencies
                 Product Purchased
Fitness    EX-11    EX-10    AIX-12    Total
1          1        1        0         2
2          12       14       0         26
3          39       54       4         97
4          8        9        7         24
5          0        2        29        31
Total      60       80       40        180

Expected Frequencies
                 Product Purchased
Fitness    EX-11       EX-10       AIX-12      Total
1          0.666667    0.888889    0.444444    2
2          8.666667    11.55556    5.777778    26
3          32.33333    43.11111    21.55556    97
4          8           10.66667    5.333333    24
5          10.33333    13.77778    6.888889    31
Total      60          80          40          180

Data
Level of Significance        0.05
Number of Rows               5
Number of Columns            3
Degrees of Freedom           8

Results
Critical Value               15.50731
Chi-Square Test Statistic    118.7768
p-Value                      5.93E-22
Reject the null hypothesis

Expected frequency assumption is violated.

Since the p-value = 0.0000 < 0.05, reject H0. There is enough evidence of a difference in the self-rated fitness based on the product purchased (EX-10, EX-11, AIX-12). However, the expected frequency assumption required for the chi-square test is violated. Hence, the conclusion might not be reliable.

3.

Refer to the conclusions of the various hypothesis tests in parts (1) and (2).

The Tri-Cities Times Case

Chapter 0

1.

Each company’s own historical data would be a primary data source. If available, this would include data about business operations, such as formal accounting statements, and revenue data for subscriptions and advertising, including breakdowns and analyses. The research team might also use secondary data sources published by private market research companies or by governmental agencies, as well as third-party analyses of advertising, market share, and intangibles such as the reputation and perception of the existing businesses.

2.

Using the DCOSAC framework (define the data, collect the data, organize and summarize the data, analyze and reach conclusions about the data, and communicate the results of the analysis) would be essential to combining the two businesses. Start with defining a business goal to be achieved, or a problem to be solved. Then take that goal and work through the framework of DCOSAC, identifying data needs and the actionable information that would be needed. When combining operations, there is often talk about “operational efficiencies” to be had that will lower expenses. In the best situations, achieving such a goal can eliminate unnecessary duplication. In other situations, the goal may be just a euphemism for large-scale layoffs or business closures, which come as a surprise to interested parties who failed to require more than just the Define step when accepting terms. For example, in human resources, determine who is employed with each company and what job each employee has, and use the information to determine how many and which employees will be needed with the new company.

3.

(a)

One needs to know more to be able to determine if Staff&Save is an application of business statistics. A reasonable observation is that the software sounds more like an operations research/management science or AI application than business statistics. However, a description of software features does not explain how the software works, which would be important to know. Some points about FTF.1 might be mentioned. Best answers would question whether evaluation data could be defined in a meaningful way for some of the factors being considered. Software to minimize staffing has been used in the retail industry for some time, but Staff&Save is talking about company-wide staffing and hiring, a much broader concept. The phrase “to determine who gets retained and scheduled” suggests that this software may be supplying management decisions and not actionable information for decision-making. That software produces actionable information to assist decision-making, and is not a replacement for that decision-making, is a constant theme in this text.

(b)

The short answer is that every step of DCOSAC could be bypassed. Even if a company sought “minimal staffing,” that goal would first need to be well-defined for that company. For example, by only using data related to employee availability, preference, and qualification, Staff&Save bypasses defining data and collecting data with respect to other areas of import, which could result in an analysis that is missing pertinent data. The apparent benefit of bypassing the skipped tasks is saved time and money, but there are no true benefits to bypassing tasks in the DCOSAC or a similar framework. The general risk is always an increased chance of error, which can include making poor or uninformed decisions.


Chapter 1

1.

This is a sampling technique issue. Answers may vary. For example, the intern suggests that an Internet poll could be posted on a social media site as well as on the new website for the Times. The advantages could include convenient sampling and avoiding coding errors and data integration errors. The disadvantages of using an Internet poll posted on a social media site and the new website include coverage error, nonresponse error, and sampling error. Other errors could include missing values. For a fact-based decision-making process, the disadvantages outweigh the advantages because an Internet poll may not represent new subscribers.

2.

This is an open response, so check all answers for plausibility of variable type. For example, useful demographic facts about subscribers could include age (years), gender, education level (years completed), marital status, household income ($), and employment status.

3.

One would expect a wider range of values; however, the survey responses may not include the data that decision makers need. Answers may vary. For example, an Internet poll would be subject to coverage error by only including responders who use the Internet, which typically leaves out the older population. An Internet poll would also include only responders who do not think that the poll is just a scam/spam. Increasingly, tech-savvy young people avoid scam/spam situations. In addition, Internet poll participants might be non-subscribers.

4.

For ordinal or nominal, the domain should be a set of values; a scale of 1 to 5 or 1 to 7, or five categories from strongly dislike to strongly like, would be best to use. For free response, one cannot quantify attitudes. “I like that change 79%” does not make sense, whereas “I strongly like that proposed change” does.

5.

Possible problems that could be listed: two differently named variables representing the same fact; the same variable with a different domain (including different codings); and variables unique to each subscriber file. Answers will vary based on what problems the student chooses. For example, separate subscriber processing systems could include different entries in the datasets: one company could use CC as a general credit card identifier, and the other company could use VISA, MC, or OTHER. Data cleaning could help with combining this data. Also, separate subscriber processing systems could include different categories of data. Combining different datasets would result in a final dataset with missing category data. For example, if one company included a category of “best contact method,” such as cell phone number, email address, or street address, and the other company did not keep track of this category of data, then when the data are combined, that category would be empty for all of the second company’s subscribers.



Chapter 2

1.

No, not in the current form because the data needs cleaning. There is variation in how categories are coded and some values need recoding. For example, “donate money” and “I will donate” both map to the same categorical value. Also, a “Venmo” response maps to, and should be recoded to, the categorical value “Other.” Because the variable is categorical, the solution could start with a tabular summary of the data and discover all of the unique values for the data set.

2.

This question requires that the data cleaning be done. There should be five categories in the summary, but what the values are for the five categories is a bit open. For example, “Donate money”, “donation”, or “I will donate” are all plausible for one of the categories, although given the frequencies of these values, “donation” is likely to be chosen. The most appropriate summaries would be a summarization of the cleaned data, so readers have a second chance to clean data here in case they missed the point of question 1.

3.

Presenting tabular form of the table data.

Subscriptions          Frequency   Percentage
Digital                54          33.75%
Print                  47          29.38%
Print with Digital     59          36.88%
Total                  160         100.00%

4.

For Table 1, the chart is fine as is.
For Table 2, the chart is passable. A change to make the chart better would be to eliminate the gradient effect, which is not modeled in this textbook. Reordering the bars here would not improve the chart because the variable of interest is operating system. A recoding of the data would lose information but would not necessarily be wrong.
For Table 3, there are too many categories. The background should be white and a more distinctive coloring for the slices should be used. As is, the chart is not fully accessible and does not meet the WCAG standard mentioned in a sidenote on page 85.
For Table 4, the chart is okay given the table data, but a better chart would have graphed the bounce rate itself as a time-series chart.
For Table 5, the map chart is fine, but the 3D chart violates best practices. The second chart could be a bar chart, but given the data, a Pareto chart would better separate the “vital few” from the “trivial many.”



Chapter 3

1.

For the daily website users (Table 1), the following descriptive statistics are computed:

Excel Results
                       Current Period   Prior Period
Mean                   54.13333333      66
Median                 51               69.5
Mode                   38               73
Minimum                31               31
Maximum                98               97
Range                  67               66
Variance               227.0851         271.8621
Standard Deviation     15.0693          16.4882
Coeff. of Variation    27.84%           24.98%
Skewness               1.0303           -0.4107
Kurtosis               1.0622           -0.4980
Count                  30               30
Standard Error         2.7513           3.0103
First Quartile         46               54
Third Quartile         64               80
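Two of the rows in the daily website users output, the coefficient of variation and the standard error, are derived from the mean, standard deviation, and count. The following Python sketch (standard library only) recomputes them from the reported values; the function and dictionary names are illustrative assumptions, not Excel or PHStat names.

```python
import math

def coefficient_of_variation(mean, sd):
    """Standard deviation expressed as a percentage of the mean."""
    return sd / mean * 100

def standard_error(sd, n):
    """Standard error of the mean: sd divided by the square root of n."""
    return sd / math.sqrt(n)

# Values copied from the Excel results for daily website users
current = {"mean": 54.13333333, "sd": 15.0693, "n": 30}
prior = {"mean": 66.0, "sd": 16.4882, "n": 30}

cv_current = coefficient_of_variation(current["mean"], current["sd"])
cv_prior = coefficient_of_variation(prior["mean"], prior["sd"])
se_current = standard_error(current["sd"], current["n"])
se_prior = standard_error(prior["sd"], prior["n"])

print(f"Current period: CV = {cv_current:.2f}%, SE = {se_current:.4f}")
print(f"Prior period:   CV = {cv_prior:.2f}%, SE = {se_prior:.4f}")
```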



1. (cont.) For the changes in bounce rate (Table 4), the following descriptive statistics are computed:

Excel Results
                       Percentage Change
Mean                   0.108733333
Median                 -0.1215
Mode                   #N/A
Minimum                -0.563
Maximum                2.166
Range                  2.729
Variance               0.4063
Standard Deviation     0.6374
Coeff. of Variation    586.22%
Skewness               1.9030
Kurtosis               3.8387
Count                  30
Standard Error         0.1164
First Quartile         -0.301
Third Quartile         0.353



2.

For daily website users, refer to the chart associated with Table 1. For changes in bounce rate (Table 4), the following Time Series is included.

3.

Daily Website Users Current Period: In the current period, the mean of daily website users is 54.1 visitors with a median of 51 visitors. The data are right-skewed. Twenty-five percent of the daily website users are below 46 visitors, 50% are below 51 visitors, and 75% are below 64 visitors, with the least number of 31 visitors and the most with 98 visitors, giving a range of 67.

Daily Website Users Prior Period: In the prior period, the mean of daily website users was 66 visitors with a median of 69.5 visitors. The data are left-skewed. Twenty-five percent of the prior period daily website users were below 54 visitors, 50% were below 69.5 visitors, and 75% were below 80 visitors, with the least number of 31 visitors and the most with 97 visitors, giving a range of 66.

Changes in Bounce Rate: The mean of changes in the bounce rate is 10.87% with a median of –12.15%. The data are right-skewed. Twenty-five percent of the changes in the bounce rate are below –30.1%, 50% are below –12.15%, and 75% are below 35.3%, with the smallest change in bounce rate of –56.3% and the largest change of 216.6%, giving a range of 272.9%.


Chapter 5

1.

Assume that the assumptions for using a binomial distribution are satisfied, and let π denote the probability that a customer will subscribe to the 3-At Large service. For a sample of n = 50:
(a) π = 0.10, P(X < 3) = P(X = 0) + P(X = 1) + P(X = 2) = 0.0052 + 0.0286 + 0.0779 = 0.1117
(b) π = 0.10, P(X = 0) + P(X = 1) = 0.0052 + 0.0286 = 0.0338
(c) π = 0.10, P(X > 3) = 1 – P(X ≤ 3) = 1 – [P(X = 0) + P(X = 1) + P(X = 2) + P(X = 3)] = 1 – (0.0052 + 0.0286 + 0.0779 + 0.1386) = 1 – 0.2503 = 0.7497
(d) P(X = 4) = 0.1809. The likelihood that you would get 4 subscribers in a sample of 50 if the probability of a subscription is 0.10 is only 0.1809. Thus, you can conclude that it is more likely than 0.10 that you will get new subscribers when no free premium channels are included.

2.

(a) π = 0.20, P(X < 3) = P(X = 0) + P(X = 1) + P(X = 2) = 0.0000 + 0.0002 + 0.0011 = 0.0013
(b) π = 0.20, P(X = 0) + P(X = 1) = 0.0000 + 0.0002 = 0.0002
(c) π = 0.20, P(X > 3) = 1 – P(X ≤ 3) = 1 – [P(X = 0) + P(X = 1) + P(X = 2) + P(X = 3)] = 1 – (0.0000 + 0.0002 + 0.0011 + 0.0044) = 1 – 0.0057 = 0.9943

(d)

The likelihood that you would get fewer than 3 subscribers when two complimentary channels are offered is 0.0013. This is much lower than when no free channels are offered, which is a probability of 0.1117. The likelihood that you would get 0 or 1 subscribers when two complimentary channels are offered is 0.0002. This is much lower than when no free channels are offered, which is a probability of 0.0338. The likelihood that you would get more than 3 subscribers when two complimentary channels are offered is 0.9943. This is much higher than when no free channels are offered, which is a probability of 0.7497.

(e)

If no premium channels were offered, the probability of getting six or more subscriptions is small (0.3839) assuming that the probability of a new subscription is 0.10. If two premium channels were offered, the probability of getting six or more subscriptions is large (0.9520) assuming that the probability of a new subscription is 0.20. Thus, you can conclude that it is more likely than 0.20 that you will get new subscribers when two free premium channels are included.
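The binomial probabilities quoted in questions 1 and 2 can be reproduced with a few lines of Python. This is a minimal sketch using only the standard library; `binom_pmf` and `binom_cdf` are hypothetical helper names, and n = 50 is taken from the sample size stated in question 1(d).

```python
from math import comb

def binom_pmf(k, n, p):
    """P(X = k) for a binomial distribution with n trials and success probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def binom_cdf(k, n, p):
    """P(X <= k), computed by summing the probability mass function."""
    return sum(binom_pmf(i, n, p) for i in range(k + 1))

n = 50
results = {}
for p in (0.10, 0.20):
    results[p] = {
        "fewer_than_3": binom_cdf(2, n, p),      # P(X < 3)
        "zero_or_one": binom_cdf(1, n, p),       # P(X = 0) + P(X = 1)
        "more_than_3": 1 - binom_cdf(3, n, p),   # P(X > 3)
        "six_or_more": 1 - binom_cdf(5, n, p),   # P(X >= 6)
    }

for p, probs in results.items():
    print(f"pi = {p:.2f}: " + ", ".join(f"{k} = {v:.4f}" for k, v in probs.items()))
```

Summing the unrounded probabilities can differ in the fourth decimal place from adding the rounded table values by hand, so small discrepancies with the printed solution are expected.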

3.

There is no single answer here, but a lot can be discussed. The ultimate question is to determine the number of premium channels to offer free. On one hand you want to maximize the chance of a new subscription, on the other hand, you don’t want to give away more premium channels than necessary. A reasonable conclusion might be to offer one premium channel as an incentive so that you can advertise that there is something being given away for free. Additional free premium channels may not produce sufficiently more new subscriptions.

4.

For a sample of 100 people, how many customers are likely to skip the offers? µ = nπ = 100(0.25) = 25

5.

(a) 25% + 20% = 45%
(b) 20%
(c) 15%
(d) 10%


6.

Part (a) group, because 25% + 20% = 45% of all subscribers, which is literally the greatest number of subscribers. It could also be argued for part (c) group, due to costs and effectiveness of the offer.


Chapter 6

1.

Subscribers: n = 1, µ = 14.6, σ = 1.3
(a) P(X < 6) = 0.0000
(b) P(X > 14) = 0.6778
Non-subscribers: n = 1, µ = 6.3, σ = 2.5
(a) P(X < 6) = 0.4522
(b) P(X > 14) = 0.0010

2.

The probability of a subscriber spending 15 minutes or more on the website is 0.3792. So, only 37.92% of subscribers spend 15 minutes or more on the website. The goal is met 37.92% of the time.

3.

The probability of a non-subscriber spending 8 minutes or more on the website is 0.2483. So, only 24.83% of non-subscribers spend 8 minutes or more on the website. The goal is met 24.83% of the time.
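The normal probabilities in questions 1 through 3 can be verified with Python's `statistics.NormalDist`. This is a sketch under the stated means and standard deviations; the variable names are illustrative only.

```python
from statistics import NormalDist

# Time spent on the website (minutes), modeled as normal with the stated parameters
subscribers = NormalDist(mu=14.6, sigma=1.3)
non_subscribers = NormalDist(mu=6.3, sigma=2.5)

# Question 1: P(X < 6) and P(X > 14) for each group
sub_below_6 = subscribers.cdf(6)
sub_above_14 = 1 - subscribers.cdf(14)
non_below_6 = non_subscribers.cdf(6)
non_above_14 = 1 - non_subscribers.cdf(14)

# Question 2: subscribers spending 15 minutes or more
sub_15_or_more = 1 - subscribers.cdf(15)

# Question 3: non-subscribers spending 8 minutes or more
non_8_or_more = 1 - non_subscribers.cdf(8)

print(f"Subscribers:     P(X < 6) = {sub_below_6:.4f}, P(X > 14) = {sub_above_14:.4f}")
print(f"Non-subscribers: P(X < 6) = {non_below_6:.4f}, P(X > 14) = {non_above_14:.4f}")
print(f"P(subscriber >= 15 min) = {sub_15_or_more:.4f}")
print(f"P(non-subscriber >= 8 min) = {non_8_or_more:.4f}")
```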



Chapter 7

1.

n = 25, µ = 6.0, σ = 1.5
(a) P(< 6.0) = 0.5000
(b) P(between 5.25 and 6.75) = 0.3829
(c) P(between 6.0 and 6.75) = 0.1915
(d) P(less than 0.95 or greater than 1.05) = 0.9999
(e) P(X̄ < 5.7) = P(Z < –1.00) = 0.1587
If the time spent on the website is normally distributed with a mean of 6.0 and a standard deviation of 1.5, the probability is 0.1587 of obtaining a sample that will yield a sample mean time spent on the website of 5.7 or less, a rather unlikely event. The fact that today’s sample of 25 yields a sample mean time spent on the website of 5.7 indicates that the distribution of the time spent on the website is most likely not normally distributed with a mean of 6.0 and a standard deviation of 1.5.

2.

The results here are based on a sample of 25. The standard error of the mean is σ/√n = 1.5/√25 = 0.3, so sample means cluster more closely around the population mean than individual times spent on the website do.
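The calculation in part (e) rests on the sampling distribution of the mean, whose standard deviation is σ/√n. A short Python sketch, assuming the stated µ = 6.0, σ = 1.5, and n = 25, shows how the Z value of –1.00 and the probability 0.1587 arise; the variable names are illustrative.

```python
import math
from statistics import NormalDist

mu, sigma, n = 6.0, 1.5, 25

# Standard error of the mean: sigma divided by the square root of n
std_error = sigma / math.sqrt(n)

# Sampling distribution of the sample mean
sample_mean_dist = NormalDist(mu=mu, sigma=std_error)

z_stat = (5.7 - mu) / std_error
prob_below_5_7 = sample_mean_dist.cdf(5.7)

print(f"standard error = {std_error:.2f}")
print(f"Z = {z_stat:.2f}, P(sample mean < 5.7) = {prob_below_5_7:.4f}")
```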

3.

µp = π = 0.14, σp = √(π(1 – π)/n) = √(0.14(1 – 0.14)/100) = 0.0346987031
(a) P(p < 16%) = 0.7178
(b) P(12% < p < 16%) = 0.4356
(c) P(p > 5%) = 0.9953
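The sampling distribution of the proportion in question 3 can be checked the same way. The sketch below assumes π = 0.14 and n = 100 as stated and uses the normal approximation; the names are illustrative.

```python
import math
from statistics import NormalDist

pi, n = 0.14, 100

# Standard error of the proportion: sqrt(pi * (1 - pi) / n)
sigma_p = math.sqrt(pi * (1 - pi) / n)
p_dist = NormalDist(mu=pi, sigma=sigma_p)

prob_a = p_dist.cdf(0.16)                      # P(p < 16%)
prob_b = p_dist.cdf(0.16) - p_dist.cdf(0.12)   # P(12% < p < 16%)
prob_c = 1 - p_dist.cdf(0.05)                  # P(p > 5%)

print(f"sigma_p = {sigma_p:.10f}")
print(f"(a) {prob_a:.4f}  (b) {prob_b:.4f}  (c) {prob_c:.4f}")
```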


Chapter 8

1.

95% confidence interval constructed with n = 300: Time: 7.09 ≤ µ ≤ 7.95. You are 95% confident that the mean time of a live chat session is between 7.09 and 7.95 minutes.

2.

95% confidence interval for the proportion constructed with n = 300: Subscribers (192): 0.5857 ≤ π ≤ 0.6943. You are 95% confident that the proportion of the subscribers is between 0.5857 and 0.6943.
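The interval for the proportion follows from p ± Z·√(p(1 – p)/n). The following Python sketch, assuming 192 subscribers in a sample of 300 as stated, reproduces the limits; the function name `proportion_ci` is a hypothetical helper.

```python
import math
from statistics import NormalDist

def proportion_ci(successes, n, confidence=0.95):
    """Normal-approximation confidence interval for a population proportion."""
    p = successes / n
    # Two-sided critical value, e.g. about 1.96 for 95% confidence
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    half_width = z * math.sqrt(p * (1 - p) / n)
    return p - half_width, p + half_width

lower, upper = proportion_ci(192, 300)
print(f"{lower:.4f} <= pi <= {upper:.4f}")
```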

3.

For the sample, the sample mean is 7.5, and the sample standard deviation is 3.7834, yielding a half width of 0.5663 minutes. There are two different answers here. First answer: digital managers would first have to determine the sampling error to allow (and other business factors) in order to determine the best sample size. Second answer: there is no one best sample size. The best sample size would be one that yields an interval estimate that would be useful for decision making. For the help chat sample, the sample mean is 7.5 and the sample standard deviation is 3.783, which creates an interval half width that is less than 0.6 minutes. That might be sufficient for a planning phase in which managers are using whole number estimates of chat times.



Chapter 9

Since the population standard deviation is unknown and the sample size is large enough at 50, the t test for the mean can be used.

1.

H0: µ ≤ 3
H1: µ > 3
Use the t Test for Hypothesis of the Mean

Data
Null Hypothesis µ =             3
Level of Significance           0.1
Sample Size                     50
Sample Mean                     3.0995
Sample Standard Deviation       0.9526775

Intermediate Calculations
Standard Error of the Mean      0.1347
t Test Statistic                0.7385

Upper-Tail Test
Upper Critical Value            1.2991
p-Value                         0.2319
Do not reject the null hypothesis

Since tSTAT = 0.7385 < tCRITICAL = 1.2991 or the p-value = 0.2319 > 0.10, do not reject the null hypothesis. At the 0.10 level of significance, there is insufficient evidence to conclude that the mean response time is greater than 3 seconds.

2.

H0: µ ≤ 3
H1: µ > 3
Use the t Test for Hypothesis of the Mean

Data
Null Hypothesis µ =             3
Level of Significance           0.01
Sample Size                     50
Sample Mean                     3.0995
Sample Standard Deviation       0.9526775

Intermediate Calculations
Standard Error of the Mean      0.1347
t Test Statistic                0.7385

Upper-Tail Test
Upper Critical Value            2.4049
p-Value                         0.2319
Do not reject the null hypothesis

Since tSTAT = 0.7385 < tCRITICAL = 2.4049 or the p-value = 0.2319 > 0.01, do not reject the null hypothesis. At the 0.01 level of significance, there is insufficient evidence to conclude that the mean response time is greater than 3 seconds.

3.

There is insufficient evidence to conclude that the mean response time is greater than 3 seconds at both the 0.10 and 0.01 levels of significance.
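The t test statistic used in questions 1 and 2 is the same; only the critical value changes with the level of significance. The Python sketch below recomputes the statistic from the summary data and applies the upper-tail decision rule. The critical values are copied from the output above rather than computed, because the Python standard library has no t distribution; the names are illustrative.

```python
import math

# Summary data from the worksheet
hypothesized_mean = 3.0
sample_mean = 3.0995
sample_sd = 0.9526775
n = 50

std_error = sample_sd / math.sqrt(n)
t_stat = (sample_mean - hypothesized_mean) / std_error

# Upper-tail critical values for 49 degrees of freedom, copied from the output above
critical_values = {0.10: 1.2991, 0.01: 2.4049}

# Reject H0 only when the statistic exceeds the upper critical value
decisions = {alpha: t_stat > crit for alpha, crit in critical_values.items()}

print(f"standard error = {std_error:.4f}, t = {t_stat:.4f}")
for alpha, reject in decisions.items():
    print(f"alpha = {alpha}: {'reject' if reject else 'do not reject'} H0")
```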



Chapter 10

1.

(a)

You need to test whether the variances are equal because, in order to conduct the t test for the difference between two independent means, you first need to determine whether the population variances of the two groups are equal.
Population: Larger Variance = Late, Smaller Variance = Early
H0: σ1² = σ2² The population variances are the same.
H1: σ1² ≠ σ2² The population variances are different.
PHStat output: F Test for Differences in Two Variances

Data Level of Significance

0.05

Larger-Variance Sample Sample Size

15

Sample Variance

458.0845714

Smaller-Variance Sample Sample Size

15

Sample Variance

320.2868571

Intermediate Calculations F Test Statistic

1.4302

Population 1 Sample Degrees of Freedom

14

Population 2 Sample Degrees of Freedom

14

Two-Tail Test Upper Critical Value

2.9786

p-Value

0.5119

Do not reject the null hypothesis

Decision rule: If FSTAT > 2.9786, reject H0.
Test statistic: FSTAT = S1²/S2² = 1.4302
Decision: Since FSTAT = 1.4302 < 2.9786 and the p-value = 0.5119 > 0.05, do not reject H0. There is not enough evidence to conclude that the two population variances are different. Hence, the appropriate test for the difference in two means is the pooled-variance t test.


The boxplots do not show serious departure from the normality assumption. Because the sample sizes are each 15, you assume that the two populations are normally distributed with roughly equal variances. Since the boxplots do not show serious departure from normality and the F test shows insufficient evidence of a difference in the variances between the two interfaces, you use the pooled-variance t test for the difference in means of the two independent samples.

Population 1 = Early Evening, 2 = Late Evening
H0: µ1 = µ2
H1: µ1 ≠ µ2
PHStat output: Pooled-Variance t Test for the Difference Between Two Means (assumes equal population variances)

Data
Hypothesized Difference

0

Level of Significance

0.05

Population 1 Sample Sample Size

15

Sample Mean

280.44

Sample Standard Deviation

17.89655992

Population 2 Sample Sample Size

15

Sample Mean

304.32

Sample Standard Deviation

21.40291035

Intermediate Calculations Population 1 Sample Degrees of Freedom

14

Population 2 Sample Degrees of Freedom

14

Total Degrees of Freedom

28

Pooled Variance

389.1857

Standard Error

7.2036

Difference in Sample Means

-23.8800

t Test Statistic

-3.3150



Two-Tail Test Lower Critical Value

-2.0484

Upper Critical Value

2.0484

p-Value

0.0025

Reject the null hypothesis

(b)

Since tSTAT = –3.315 < –2.0484 or the p-value of 0.0025 < 0.05, you reject the null hypothesis at the 5% level of significance. There is enough evidence to conclude that the two population mean call times are different.

2.

For the answers to parts (a) and (b) of question 1, using a 0.01 level of significance changes only the critical value.

(a)

F-test Upper Critical Value at 0.05 level of significance is 2.9786, and at 0.01 is 4.2993. Since the FSTAT = 1.4302 < both, there is no difference in the answer.

(b)

t-test Lower Critical Value at 0.05 level of significance is –2.0484, and at 0.01 is –2.7633. Since the tSTAT = –3.3150 < both, there is no difference in the answer.
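The F statistic and the pooled-variance t statistic can be recomputed from the summary values in the PHStat output. The sketch below uses only the standard library; the critical values at both levels of significance are copied from the text above rather than computed, and all names are illustrative.

```python
import math

# Summary statistics from the PHStat output (1 = Early Evening, 2 = Late Evening)
n1, mean1, var1 = 15, 280.44, 320.2868571
n2, mean2, var2 = 15, 304.32, 458.0845714

# F test: larger sample variance over smaller sample variance
f_stat = max(var1, var2) / min(var1, var2)

# Pooled variance weights each sample variance by its degrees of freedom
pooled_var = ((n1 - 1) * var1 + (n2 - 1) * var2) / (n1 + n2 - 2)
std_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
t_stat = (mean1 - mean2) / std_error

# Two-tail critical values copied from the text for each level of significance
f_critical = {0.05: 2.9786, 0.01: 4.2993}
t_critical = {0.05: 2.0484, 0.01: 2.7633}

reject_equal_variances = {a: f_stat > c for a, c in f_critical.items()}
reject_equal_means = {a: abs(t_stat) > c for a, c in t_critical.items()}

print(f"F = {f_stat:.4f}, pooled variance = {pooled_var:.4f}")
print(f"standard error = {std_error:.4f}, t = {t_stat:.4f}")
print("Reject equal variances:", reject_equal_variances)
print("Reject equal means:", reject_equal_means)
```

The decisions agree at both levels: the variances are not shown to differ, and the means are.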

3.

Using the two-tail test from PHStat Z Test for Differences in Two Proportions

Data Hypothesized Difference

0

Level of Significance

0.05

Group 1 Number of Items of Interest

612

Sample Size

2218

Group 2

Number of Items of Interest

535

Sample Size

2112

Intermediate Calculations Group 1 Proportion

0.275924256

Group 2 Proportion

0.253314394

Difference in Two Proportions

0.022609862

Average Proportion

0.2649

Z Test Statistic

1.6853

Two-Tail Test Lower Critical Value

-1.9600

Upper Critical Value

1.9600

p-Value

0.0919

Do not reject the null hypothesis



H0: π1 = π2
H1: π1 ≠ π2
where Populations: 1 = Early Evening, 2 = Late Evening
Decision rule: If ZSTAT < –1.960 or ZSTAT > 1.960, reject H0.
ZSTAT = 1.6853
Decision: Since ZSTAT = 1.6853 is between the two critical bounds, and the p-value = 0.0919 > 0.05, do not reject H0. There is insufficient evidence of a difference between early evening and late evening in the proportion of billing or payment calls at the 0.05 level of significance.

4.

Using the upper tail test from PHStat Z Test for Differences in Two Proportions

Data Hypothesized Difference

0

Level of Significance

0.05

Group 1 Number of Items of Interest

612

Sample Size

2218

Group 2 Number of Items of Interest

535

Sample Size

2112

Intermediate Calculations Group 1 Proportion

0.275924256

Group 2 Proportion

0.253314394

Difference in Two Proportions

0.022609862

Average Proportion

0.2649

Z Test Statistic

1.6853

Upper-Tail Test Upper Critical Value

1.6449

p-Value

0.0460

Reject the null hypothesis

H0: π1 ≤ π2
H1: π1 > π2
where Populations: 1 = Early Evening, 2 = Late Evening


Decision rule: If ZSTAT > 1.6449, reject H0. Z STAT = 1.6853 Decision: Since ZSTAT = 1.6853 > 1.6449, and p-value = 0.0460 < 0.05, reject H0. There is evidence that the proportion of billing or payment calls made in the early evening is greater than the proportion of such calls made in the late evening at the 0.05 level of significance.
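Questions 3 and 4 use the same Z statistic and differ only in how the p-value is taken from it. A minimal Python sketch of the pooled Z test for two proportions, using the counts from the PHStat output, makes the contrast explicit; the function name `two_proportion_z` is a hypothetical helper.

```python
import math
from statistics import NormalDist

def two_proportion_z(x1, n1, x2, n2):
    """Z statistic for the difference in two proportions using the pooled estimate."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    std_error = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / std_error

# Group 1 = Early Evening, Group 2 = Late Evening
z_stat = two_proportion_z(612, 2218, 535, 2112)

std_normal = NormalDist()
p_upper_tail = 1 - std_normal.cdf(z_stat)            # question 4: H1 is pi1 > pi2
p_two_tail = 2 * (1 - std_normal.cdf(abs(z_stat)))   # question 3: H1 is pi1 != pi2

print(f"Z = {z_stat:.4f}")
print(f"two-tail p-value = {p_two_tail:.4f}, upper-tail p-value = {p_upper_tail:.4f}")
```

Because the upper-tail p-value is half the two-tail p-value, the same data lead to rejection in the one-tail test but not in the two-tail test at the 0.05 level.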


Chapter 11

1. First you need to determine whether there is a difference in the variance of the three home pages.
H0: σ1² = σ2² = σ3²
H1: Not all σj² are equal (j = 1, 2, 3)

PHStat output: Levene Test on Home Pages

SUMMARY
Groups       Count   Sum      Average   Variance
Headlines    8       21.36    2.67      3.8179
Subheads     8       15.52    1.94      5.0885
Excerpts     8       20.96    2.62      3.2690

ANOVA
Source of Variation   SS        df   MS       F        P-value   F crit
Between Groups        2.6608    2    1.3304   0.3278   0.7241    3.4668
Within Groups         85.2280   21   4.0585
Total                 87.8888   23

Level of significance   0.05

Since FSTAT = 0.3278 < 3.4668 or p-value = 0.7241 > 0.05, you do not reject H0. There is insufficient evidence of a difference in the variation of the time spent between the home pages. Now that you can assume that the home pages do not differ in their variances, you can test to determine whether there is a difference in their mean times.
H0: µ1 = µ2 = µ3
H1: At least one of the means differs
PHStat output:


ANOVA: Single Factor

SUMMARY
Groups       Count   Sum      Average   Variance
Headlines    8       265.36   33.17     11.9514
Subheads     8       230.4    28.8      9.3440
Excerpts     8       227.52   28.44     11.1067

ANOVA
Source of Variation   SS         df   MS        F        P-value   F crit
Between Groups        110.9317   2    55.4659   5.1354   0.0153    3.4668
Within Groups         226.8152   21   10.8007
Total                 337.7469   23

Level of significance   0.05

1. cont. Since FSTAT = 5.1354 > 3.4668 or p-value = 0.0153 < 0.05, you reject H0. There is evidence of a difference in the mean time spent between the home pages. Now, you can use the Tukey-Kramer multiple comparisons to determine which home pages differ in their mean times.

PHStat output: Tukey-Kramer Multiple Comparisons

Group          Sample Mean   Sample Size
1: Headlines   33.17         8
2: Subheads    28.8          8
3: Excerpts    28.44         8

Other Data
Level of significance   0.05
Numerator d.f.          3
Denominator d.f.        21
MSW                     10.80072
Q Statistic             3.58

Comparison           Absolute Difference   Std. Error of Difference   Critical Range   Results
Group 1 to Group 2   4.37                  1.161933938                4.16             Means are different
Group 1 to Group 3   4.73                  1.161933938                4.16             Means are different
Group 2 to Group 3   0.36                  1.161933938                4.16             Means are not different

Headlines has a higher mean time than Subheads or Excerpts.
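The Tukey-Kramer critical range combines the Q statistic with the within-group mean square and the two sample sizes. The sketch below recomputes it from the values in the output; the Q statistic 3.58 is copied from the output, since the studentized range distribution is not in the Python standard library, and the names are illustrative.

```python
import math
from itertools import combinations

# Group means and sizes from the PHStat output
groups = {"Headlines": (33.17, 8), "Subheads": (28.8, 8), "Excerpts": (28.44, 8)}
msw = 10.80072      # mean square within, from the ANOVA table
q_statistic = 3.58  # studentized range value, copied from the output

comparisons = {}
for (name_a, (mean_a, n_a)), (name_b, (mean_b, n_b)) in combinations(groups.items(), 2):
    # Standard error of the difference for the Tukey-Kramer procedure
    std_error = math.sqrt((msw / 2) * (1 / n_a + 1 / n_b))
    critical_range = q_statistic * std_error
    abs_diff = abs(mean_a - mean_b)
    comparisons[(name_a, name_b)] = (abs_diff, critical_range, abs_diff > critical_range)

for (a, b), (diff, crit, different) in comparisons.items():
    verdict = "different" if different else "not different"
    print(f"{a} vs {b}: |diff| = {diff:.2f}, critical range = {crit:.2f}, means are {verdict}")
```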



2.

PHStat output: Anova: Two-Factor With Replication

SUMMARY      Headlines   Subheads   Excerpts   Total

Casual
Count        6           6          6          18
Sum          259.51      239.9      207.9      707.31
Average      43.25167    39.98333   34.65      39.295
Variance     19.9944     5.3857     19.0270    26.3686

Subscriber
Count        6           6          6          18
Sum          279.6       242.41     237.7      759.71
Average      46.6        40.40167   39.6167    42.20611
Variance     14.2880     10.9748    13.6537    21.7757

Total
Count        12          12         12
Sum          539.11      482.31     445.6
Average      44.92583    40.1925    37.13333
Variance     18.6406     7.4843     21.5824

ANOVA
Source of Variation   SS         df   MS         F         P-value   F crit
Sample                76.2711    1    76.2711    5.4922    0.0259    4.1709
Columns               369.9440   2    184.9720   13.3195   0.0001    3.3158
Interaction           31.8912    2    15.9456    1.1482    0.3307    3.3158
Within                416.6178   30   13.8873
Total                 894.7242   35

2. cont. At the 5% level of significance, since FSTAT = 1.1482 < 3.3158 or p-value = 0.3307 > 0.05, there is insufficient evidence to conclude that there is interaction between visitor type and home page. At the 5% level of significance, since FSTAT = 13.3195 > 3.3158 or p-value = 0.0001 < 0.05, there is enough evidence to conclude that the mean differs among the home pages. At the 5% level of significance, since FSTAT = 5.4922 > 4.1709 or p-value = 0.0259 < 0.05, there is enough evidence to conclude that the mean differs between the visitor types (casual and subscriber).



Chapter 12

1.

Using Table 1: Chi-Square Test

Observed Frequencies
             First Subscription
Renewed?     Promotional   Direct     Upgrade    Total
Yes          45            162        100        307
No           202           105        40         347
Total        247           267        140        654

Expected Frequencies
             First Subscription
Renewed?     Promotional   Direct     Upgrade    Total
Yes          115.9465      125.3349   65.71865   307
No           131.0535      141.6651   74.28135   347
Total        247           267        140        654

Data
Level of Significance        0.05
Number of Rows               2
Number of Columns            3
Degrees of Freedom           2

Results
Critical Value               5.991465
Chi-Square Test Statistic    135.7376
p-Value                      3.35E-30
Reject the null hypothesis

Expected frequency assumption is met.

χ² = 135.7376 > 5.991465. Reject H0. There is evidence of a difference in the proportion of subscribers who renewed their subscription among the three first subscription types.



1. Using Table 1: cont. Marascuilo Procedure

Level of Significance

0.05

Square Root of Critical Value

2.447746831

Sample Proportions Group 1

0.182186235

Group 2

0.606741573

Group 3

0.714285714

MARASCUILO TABLE Proportions

Absolute Differences

Critical Range

| Group 1 - Group 2 |

0.424555338

0.094701947 Significant

| Group 1 - Group 3 |

0.532099479

0.111121834 Significant

| Group 2 - Group 3 |

0.107544141

0.118693822 Not significant

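A promotional first subscription results in a significantly lower proportion of subscribers who renew (0.18) than a direct first subscription (0.61) or an upgrade (0.71). The renewal proportions for direct and upgrade first subscriptions are not significantly different.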
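The Marascuilo critical ranges can be recomputed from the renewal counts in Table 1. In the sketch below, the chi-square critical value 5.991465 is copied from the chi-square output rather than computed, and the group labels and variable names are illustrative assumptions.

```python
import math
from itertools import combinations

# Renewals and group sizes from Table 1 (by type of first subscription)
groups = {"Promotional": (45, 247), "Direct": (162, 267), "Upgrade": (100, 140)}
chi_square_critical = 5.991465  # upper-tail critical value for 2 d.f. at 0.05, from the output
root_critical = math.sqrt(chi_square_critical)

proportions = {name: x / n for name, (x, n) in groups.items()}

marascuilo = {}
for a, b in combinations(groups, 2):
    p_a, n_a = proportions[a], groups[a][1]
    p_b, n_b = proportions[b], groups[b][1]
    abs_diff = abs(p_a - p_b)
    # Critical range: sqrt(critical value) times the standard error of the difference
    critical_range = root_critical * math.sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
    marascuilo[(a, b)] = (abs_diff, critical_range, abs_diff > critical_range)

for (a, b), (diff, crit, significant) in marascuilo.items():
    label = "Significant" if significant else "Not significant"
    print(f"|{a} - {b}| = {diff:.4f}, critical range = {crit:.4f}: {label}")
```

The sample proportions printed by this sketch also show the direction of each difference, which the absolute differences in the Marascuilo table do not.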

Copyright ©2024 Pearson Education, Inc.


1. Using Table 2 (cont.):

Chi-Square Test

Observed Frequencies
                 Subscription Type
Recommend?    Basic    At Large    Total
Yes             334         314      648
No              291         340      631
Total           625         654     1279

Expected Frequencies
                 Subscription Type
Recommend?       Basic    At Large    Total
Yes           316.6536    331.3464      648
No            308.3464    322.6536      631
Total              625         654     1279

Data
Level of Significance    0.05
Number of Rows           2
Number of Columns        2
Degrees of Freedom       1

Results
Critical Value               3.841459
Chi-Square Test Statistic    3.766747
p-Value                      0.052281
Do not reject the null hypothesis

Expected frequency assumption is met.

$\chi^2 = 3.766747 < 3.841459$. Do not reject H0. There is insufficient evidence of a difference in the proportion of subscribers who would recommend between the two subscription types.


2. From Table 1, we combine "direct" and "upgrade" into "Not" and complete a chi-square test.

Chi-Square Test

Observed Frequencies
                  First Subscription
Renewed?    Promotional    Not    Total
Yes                  45    262      307
No                  202    145      347
Total               247    407      654

Expected Frequencies
                  First Subscription
Renewed?    Promotional         Not    Total
Yes            115.9465    191.0535      307
No             131.0535    215.9465      347
Total               247         407      654

Data
Level of Significance    0.05
Number of Rows           2
Number of Columns        2
Degrees of Freedom       1

Results
Critical Value               3.841459
Chi-Square Test Statistic    131.4728
p-Value                      1.95E-30
Reject the null hypothesis

Expected frequency assumption is met.

$\chi^2 = 131.4728 > 3.841459$. Reject H0. There is evidence of a difference in the proportion of subscribers who renewed their subscription between the promotional first subscription and not (direct and upgrade).


2. (cont.)

Marascuilo Procedure

Level of Significance            0.05
Square Root of Critical Value    1.959963985

Sample Proportions
Group 1 (Promotional)    0.182186235
Group 2 (Not)            0.643734644

MARASCUILO TABLE
Proportions              Absolute Differences    Critical Range
| Group 1 - Group 2 |             0.461548409       0.066946645    Significant

The promotional first subscription has a significantly lower proportion of subscribers who renew (0.1822) than not promotional, that is, direct subscription or upgrade (0.6437).

3. Answers may vary.

4. Answers may vary. The subscribers to each plan are as follows: Plan A 54 subscribers, Plan B 139 subscribers, Plan C 243 subscribers, and Plan D 67 subscribers. There are significantly fewer subscribers to Plan A and Plan D than Plan B or Plan C.


5. Using Table 3:

Chi-Square Test

Observed Frequencies
                     Initial Subscription
Renewed?    Plan A    Plan B    Plan C    Plan D    Total
Yes             13        50       189        52      304
No              41        89        54        15      199
Total           54       139       243        67      503

Expected Frequencies
                     Initial Subscription
Renewed?      Plan A      Plan B      Plan C      Plan D    Total
Yes         32.63618    84.00795    146.8628    40.49304      304
No          21.36382    54.99205    96.13718    26.50696      199
Total             54         139         243          67      503

Data
Level of Significance    0.05
Number of Rows           2
Number of Columns        4
Degrees of Freedom       3

Results
Critical Value               7.814728
Chi-Square Test Statistic    103.4847
p-Value                      2.77E-22
Reject the null hypothesis

Expected frequency assumption is met.

$\chi^2 = 103.4847 > 7.814728$. Reject H0. There is evidence of a difference in the proportion of subscribers who renewed their subscription based on the initial subscription plan (the four plans).


5. Using Table 3 (cont.):

Marascuilo Procedure

Level of Significance            0.05
Square Root of Critical Value    2.795483483

Sample Proportions
Group 1 (Plan A)    0.240740741
Group 2 (Plan B)    0.35971223
Group 3 (Plan C)    0.777777778
Group 4 (Plan D)    0.776119403

MARASCUILO TABLE
Proportions              Absolute Differences    Critical Range
| Group 1 - Group 2 |             0.118971489       0.19849654     Not significant
| Group 1 - Group 3 |             0.537037037       0.178914751    Significant
| Group 1 - Group 4 |             0.535378662       0.21614538     Significant
| Group 2 - Group 3 |             0.418065548       0.136041203    Significant
| Group 2 - Group 4 |             0.416407173       0.182251326    Significant
| Group 3 - Group 4 |             0.001658375       0.160702078    Not significant

There is a difference between those who signed up for Plan A and those who signed up for Plan C or Plan D. There is also a difference between those who signed up for Plan B and those who signed up for Plan C or Plan D. There is no significant difference between those who signed up for Plan A and Plan B, or between those who signed up for Plan C and Plan D.

6. Thus, people who subscribed to Plan C or Plan D are more likely to renew.


7. Using Table 4:

Chi-Square Test

Observed Frequencies
                              Subscriber Type
Selection             Basic    At Large    Add-on    Total
Digital                  70           7        75      152
Digital Ad-Free          12          18         9       39
Digital All-Access        5          72         4       81
Print+Digital            13           3        12       28
Total                   100         100       100      300

Expected Frequencies
                              Subscriber Type
Selection                Basic    At Large      Add-on    Total
Digital               50.66667    50.66667    50.66667      152
Digital Ad-Free             13          13          13       39
Digital All-Access          27          27          27       81
Print+Digital         9.333333    9.333333    9.333333       28
Total                      100         100         100      300

Data
Level of Significance    0.05
Number of Rows           4
Number of Columns        3
Degrees of Freedom       6

Results
Critical Value               12.59159
Chi-Square Test Statistic    178.9467
p-Value                      5.68E-36
Reject the null hypothesis

Expected frequency assumption is met.

$\chi^2 = 178.9467 > 12.59159$. Reject H0. There is evidence of a relationship between subscriber type and selection: the proportions choosing each selection differ among the three subscriber types.

8. Answers may vary.



Chapter 13

1. Develop a regression model to predict revenue based on the number of clicks a digital ad generates.

Simple Linear Regression Analysis

Regression Statistics
Multiple R           0.8527
R Square             0.7270
Adjusted R Square    0.7242
Standard Error       3.7656
Observations         100

ANOVA
              df           SS           MS           F    Significance F
Regression     1    3701.0064    3701.0064    261.0091            0.0000
Residual      98    1389.6012      14.1796
Total         99    5090.6076

             Coefficients    Standard Error     t Stat    P-value    Lower 95%
Intercept          9.0847            1.3018     6.9783     0.0000       6.5013
Clicks             0.1656            0.0102    16.1558     0.0000       0.1452

$\hat{Y} = 9.0847 + 0.1656X$, where X = the number of clicks.

The residual plot reveals no evidence of a pattern in the residuals. There appears to be no violation of the linearity and equal variance assumptions. The regression model to predict revenue: $\hat{Y} = 9.0847 + 0.1656X$, where X = number of clicks.

2. For 90 clicks, the predicted revenue is $\hat{Y} = 9.0847 + 0.1656(90) = 23.98$.
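The summary statistics in the regression output follow directly from the ANOVA sums of squares, which gives a quick consistency check. The sketch below uses only the rounded values printed above; the function name `predict_revenue` is hypothetical. Because the coefficients are rounded to four decimals, the prediction for 90 clicks comes out near 23.99 rather than exactly the printed 23.98.

```python
from math import sqrt

# Values copied from the ANOVA table above.
ssr, sse = 3701.0064, 1389.6012   # regression and residual sums of squares
n, k = 100, 1                     # observations, independent variables

sst = ssr + sse
r2 = ssr / sst                                  # coefficient of determination
adj_r2 = 1 - (1 - r2) * (n - 1) / (n - k - 1)   # adjusted r-squared
mse = sse / (n - k - 1)                         # mean square error
f_stat = (ssr / k) / mse                        # overall F test statistic
std_error = sqrt(mse)                           # standard error of the estimate


def predict_revenue(clicks, b0=9.0847, b1=0.1656):
    """Predicted revenue from the fitted line, using the rounded coefficients."""
    return b0 + b1 * clicks


print(f"r2 = {r2:.4f}, adjusted r2 = {adj_r2:.4f}, F = {f_stat:.4f}")
print(f"standard error = {std_error:.4f}")
print(f"predicted revenue at 90 clicks = {predict_revenue(90):.4f}")
```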


3.

There are many factors that might be considered.

4.

The value of 525 for X, the number of clicks, is beyond the range of our X values, so predicting revenue for it would require extrapolating outside the relevant range of the model.

5.

Answers will vary. A visitor's engagement could be measured by the length of time the visitor spends at the website, or by information about purchases from sponsored content.

6.

Answers will vary.

7.

A generalized prediction line for a simple linear model: $\hat{Y} = b_0 + b_1 X$, where X = time spent at website.


Chapter 14

Let Y = revenue, $X_1$ = number of clicks, $X_2$ = 1 if the ad appears on the website, 0 if the ad does not appear on the website, and $X_3 = X_1 X_2$.

Regression Analysis

Regression Statistics
Multiple R           0.8546
R Square             0.7304
Adjusted R Square    0.7220
Standard Error       3.7811
Observations         100

ANOVA
              df           SS           MS          F    Significance F
Regression     3    3718.1386    1239.3795    86.6908            0.0000
Residual      96    1372.4690      14.2966
Total         99    5090.6076

                    Coefficients    Standard Error     t Stat    P-value    Lower 95%    Upper 95%
Intercept                10.5607            1.9959     5.2912     0.0000       6.5989      14.5224
Clicks                    0.1566            0.0153    10.2120     0.0000       0.1261       0.1870
Home Page                -2.5013            2.6467    -0.9451     0.3470      -7.7549       2.7523
Clicks*Home Page          0.0154            0.0207     0.7442     0.4586      -0.0257       0.0565

Testing the significance of the interaction: $H_0: \beta_3 = 0$ vs. $H_1: \beta_3 \neq 0$. Since the p-value of the t-test statistic for the significance of $X_3$ is 0.4586 > 0.05, do not reject the null hypothesis. There is insufficient evidence of an interaction between number of clicks and the ad appearing on the website.

The Excel results of the multiple regression without the interaction term are as follows:

Regression Analysis

Regression Statistics
Multiple R           0.8537
R Square             0.7288
Adjusted R Square    0.7232
Standard Error       3.7724
Observations         100

ANOVA
              df           SS           MS           F    Significance F
Regression     2    3710.2212    1855.1106    130.3590            0.0000
Residual      97    1380.3864      14.2308
Total         99    5090.6076

             Coefficients    Standard Error     t Stat    P-value    Lower 95%    Upper 95%
Intercept          9.5096            1.4070     6.7586     0.0000       6.7171      12.3022
Clicks             0.1650            0.0103    16.0367     0.0000       0.1446       0.1854
Home Page         -0.6164            0.7660    -0.8047     0.4230      -2.1368       0.9040

Durbin-Watson Calculations
Sum of Squared Difference of Residuals    3409.766552
Sum of Squared Residuals                  1380.386434
Durbin-Watson Statistic                   2.470153623

Regression Analysis
Coefficients of Partial Determination

Intermediate Calculations
SSR(X1,X2)      3710.221166
SST             5090.6076
SSR(X2)         50.41387258
SSR(X1 | X2)    3659.807293
SSR(X1)         3701.006381
SSR(X2 | X1)    9.214784733

Coefficients
r2 Y1.2    0.72612433
r2 Y2.1    0.006631244

The 5% critical values of the Durbin-Watson statistic are $d_L = 1.65$ and $d_U = 1.69$. The Durbin-Watson test statistic = 2.47 > 1.69. There is no evidence of positive autocorrelation in the data.

Testing for the overall significance of the multiple regression: $H_0: \beta_1 = \beta_2 = 0$ vs. $H_1$: not all $\beta_j = 0$. Since the p-value of the overall F test statistic is essentially zero, reject the null hypothesis and conclude that there is evidence that revenue depends on the number of clicks and/or whether the ad appears on the website.

Testing for the effect of each individual independent variable on the revenue:

$H_0: \beta_1 = 0$ vs. $H_1: \beta_1 \neq 0$. Since the p-value of the t-test statistic for the significance of $X_1$ is essentially zero, reject the null hypothesis and conclude that there is evidence that the number of clicks has a significant effect on the revenue.

$H_0: \beta_2 = 0$ vs. $H_1: \beta_2 \neq 0$. Since the p-value of the t-test statistic for the significance of $X_2$ is 0.4230 > 0.05, do not reject the null hypothesis. There is insufficient evidence that whether the ad appears on the website has a significant effect on the revenue.

There is no pattern in the residuals versus the number of clicks or whether the ad appears on the website.

The best model to predict the revenue is $\hat{Y} = 9.5096 + 0.1650X_1 - 0.6164X_2$.

72.88% of the variation in the revenue can be explained by variation in the clicks and whether the ad appears on the website. Holding constant whether the ad appears on the website, 72.61% of the variation in revenue can be explained by variation in clicks. Holding constant the number of clicks, 0.66% of the variation in revenue can be explained by variation in whether the ad appears on the website.

Since the regression coefficient for whether the ad appears on the website is negative, holding constant the number of clicks, having the ad on the website is predicted to decrease the revenue by a mean of 0.6164. Holding constant whether the ad appears on the website, for each increase of one click, the mean revenue is predicted to increase by 0.1650.
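The Durbin-Watson statistic and the coefficients of partial determination quoted above can be recomputed from the intermediate sums of squares. The following is a small sketch that uses only the values printed in the Excel output; the variable names are chosen for this example.

```python
# Durbin-Watson statistic: sum of squared differences of successive residuals
# divided by the sum of squared residuals (both copied from the output above).
sum_sq_diff_residuals = 3409.766552
sum_sq_residuals = 1380.386434
durbin_watson = sum_sq_diff_residuals / sum_sq_residuals

# Sums of squares copied from the intermediate calculations above.
sst = 5090.6076
ssr_x1_x2 = 3710.221166   # both variables in the model
ssr_x1 = 3701.006381      # clicks only
ssr_x2 = 50.41387258      # home page only

# Contribution of each variable given that the other is already included.
ssr_x1_given_x2 = ssr_x1_x2 - ssr_x2
ssr_x2_given_x1 = ssr_x1_x2 - ssr_x1

# Coefficients of partial determination.
r2_y1_2 = ssr_x1_given_x2 / (sst - ssr_x1_x2 + ssr_x1_given_x2)
r2_y2_1 = ssr_x2_given_x1 / (sst - ssr_x1_x2 + ssr_x2_given_x1)

print(f"Durbin-Watson = {durbin_watson:.4f}")
print(f"r2 Y1.2 = {r2_y1_2:.4f}, r2 Y2.1 = {r2_y2_1:.4f}")
```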


Chapter 15

1.

Before beginning, please note that even a math major who is a super-excellent solutions-preparer cannot execute a solution for this case. One reason is that the scope of the book does not include things such as VIFs in something other than an OLS regression. But the concept of multicollinearity does apply.

The marketing department clearly seeks a predictive model with a categorical dependent variable, such as subscriber, so one is considering a model created by logistic regression analysis. One is given 40 candidate variables. What does one do with them? The case says, "the department would like to use those fact." One's question would be, "Why?" Perhaps these facts were selected because they were easily collectible. Establishing relevant facts would be the first thing to do. This invokes the D in DCOSAC. Note that Exhibit 15.1, which slightly oversimplifies things, invokes DCOSAC in step 1. A complete answer would use DCOSAC as a frame, illustrating what a framework does!

As established early in the book (did students forget? It's Chapter 15 and a long semester or two later!), some prior task may need to be redone. For example, while data has already been collected, additional "facts" may need to be collected based on the outcome of the design task. The requirement to "be specific about methodology" is a reference both to problem-solving methodology and to the inferential methods and techniques that Chapters 14 and 15 discuss. A higher-level student would walk through the things Sections 15.1 and 15.2 discuss. A lower-level answer should acknowledge that the "analyze" task is not a simple application of a particular method. Any answer should acknowledge the principle of parsimony in some way.



Chapter 16

1.

For the Visitors vs Month time series, there is a sharp downward trend (an irregular component) followed by a slight upward trend.

The Visitors vs Month, MA(3), ES(0.5), ES(0.25) time-series plot shows the three-month moving average and the exponential smoothing (W = 0.5 and W = 0.25) over the time series.

1. (cont.) For the Page Impressions vs Month time series, there is a sharp downward trend (an irregular component) followed by a slight upward trend.

The Page Impressions vs Month, MA(3), ES(0.5), ES(0.25) time-series plot shows the three-month moving average and the exponential smoothing (W = 0.5 and W = 0.25) over the time series.

1. (cont.) For the Bounce Rates vs Month time series, there is a sharp downward trend (an irregular component) followed by a slight upward trend.

The Bounce Rates vs Month, MA(3), ES(0.5), ES(0.25) time-series plot shows the three-month moving average and the exponential smoothing (W = 0.5 and W = 0.25) over the time series.
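For readers who want to see how the MA(3) and ES(W) series in these plots are built, here is a minimal sketch. The monthly values in `visitors` are hypothetical, since the case data are not reproduced in this manual, and the function names are chosen for illustration. The moving average is centered and assumes an odd length; exponential smoothing starts with $E_1 = Y_1$ and then uses $E_i = W Y_i + (1 - W) E_{i-1}$.

```python
def moving_average(series, length=3):
    """Centered moving average of odd length; None where the window does not fit."""
    half = length // 2
    smoothed = [None] * len(series)
    for i in range(half, len(series) - half):
        window = series[i - half:i + half + 1]
        smoothed[i] = sum(window) / length
    return smoothed


def exponential_smoothing(series, w):
    """Exponentially smoothed series with smoothing coefficient w (0 < w < 1)."""
    smoothed = [series[0]]  # E1 = Y1
    for y in series[1:]:
        smoothed.append(w * y + (1 - w) * smoothed[-1])
    return smoothed


# Hypothetical monthly visitor counts, for illustration only.
visitors = [900, 700, 500, 400, 450, 480]

ma3 = moving_average(visitors, 3)
es50 = exponential_smoothing(visitors, 0.5)
es25 = exponential_smoothing(visitors, 0.25)

print("MA(3):   ", ma3)
print("ES(0.5): ", es50)
print("ES(0.25):", es25)
```

A smaller W gives more weight to earlier values, so ES(0.25) reacts more slowly to the sharp initial drop than ES(0.5).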

2.

It would be better if there were more data than just 18 months. There is not enough data to show trends. The sharp downward trend in each of the time series could be explained and smoothed away with more data.



3. Number of Visitors:

Linear model: Visitors vs Coded Month
r2 = 0.3308, adjusted r2 = 0.2890
               Coefficients    Standard Error     t Stat    P-value
Intercept          767.3918          103.5712     7.4093     0.0000
Coded Month        -29.2487           10.4005    -2.8122     0.0125

Quadratic model: Visitors vs Coded Month, Coded Month Sq
r2 = 0.7234, adjusted r2 = 0.6865
                  Coefficients    Standard Error     t Stat    P-value
Intercept            1080.1404           96.5607    11.1861     0.0000
Coded Month          -146.5294           26.3398    -5.5630     0.0001
Coded Month Sq          6.8989            1.4952     4.6140     0.0003

Exponential model: log(Visitors) vs Coded Month
r2 = 0.3376, adjusted r2 = 0.2962
               Coefficients    Standard Error     t Stat    P-value
Intercept            2.8305            0.0621    45.5488     0.0000
Coded Month         -0.0178            0.0062    -2.8558     0.0114

Autoregressive model third order: Visitors vs Lag1, Lag2, Lag3 (rows unused: 3)
r2 = 0.0552, adjusted r2 = -0.2025
             Coefficients    Standard Error     t Stat    P-value
Intercept        410.5708           99.6986     4.1181     0.0017
Lag1               0.1457            0.2729     0.5338     0.6041
Lag2              -0.1391            0.2744    -0.5071     0.6221
Lag3               0.0181            0.1471     0.1231     0.9043

Autoregressive model second order: Visitors vs Lag1, Lag2 (rows unused: 2)
r2 = 0.3969, adjusted r2 = 0.3042
             Coefficients    Standard Error     t Stat    P-value
Intercept        272.4606           64.9359     4.1958     0.0010
Lag1               0.3848            0.2598     1.4813     0.1624
Lag2              -0.0321            0.1423    -0.2256     0.8250

Autoregressive model first order: Visitors vs Lag1 (rows unused: 1)
r2 = 0.7965, adjusted r2 = 0.7829
             Coefficients    Standard Error     t Stat    P-value
Intercept        210.9523           37.4591     5.6315     0.0000
Lag1               0.4900            0.0639     7.6625     0.0000

The first-order autoregressive model (Visitors vs Lag1) would be best suited for prediction: $\hat{Y}_i = 210.9523 + 0.4900\,Y_{i-1}$


3. (cont.) Page Impressions:

Linear model: Page Impressions vs Coded Month
r2 = 0.3192, adjusted r2 = 0.2767
               Coefficients    Standard Error     t Stat    P-value
Intercept         1447.9357          213.5708     6.7797     0.0000
Coded Month        -58.7441           21.4466    -2.7391     0.0146

Quadratic model: Page Impressions vs Coded Month, Coded Month Sq
r2 = 0.8288, adjusted r2 = 0.8059
                  Coefficients    Standard Error     t Stat    P-value
Intercept            2176.3860          155.3194    14.0123     0.0000
Coded Month          -331.9129           42.3679    -7.8341     0.0000
Coded Month Sq         16.0688            2.4050     6.6813     0.0000

Exponential model: log(Page Impressions) vs Coded Month
r2 = 0.2610, adjusted r2 = 0.2148
               Coefficients    Standard Error     t Stat    P-value
Intercept            3.0869            0.0783    39.4423     0.0000
Coded Month         -0.0187            0.0079    -2.3769     0.0303

Autoregressive model third order: Page Impressions vs Lag1, Lag2, Lag3 (rows unused: 3)
r2 = 0.2122, adjusted r2 = -0.0027
             Coefficients    Standard Error     t Stat    P-value
Intercept        572.8549          132.0586     4.3379     0.0012
Lag1               0.1102            0.2546     0.4328     0.6735
Lag2               0.3030            0.2355     1.2869     0.2246
Lag3              -0.1948            0.1489    -1.3081     0.2175

Autoregressive model second order: Page Impressions vs Lag1, Lag2 (rows unused: 2)
r2 = 0.5780, adjusted r2 = 0.5130
             Coefficients    Standard Error     t Stat    P-value
Intercept        398.9237          104.6132     3.8133     0.0022
Lag1               0.3909            0.2531     1.5446     0.1464
Lag2               0.0502            0.1749     0.2869     0.7787

Autoregressive model first order: Page Impressions vs Lag1 (rows unused: 1)
r2 = 0.7977, adjusted r2 = 0.7842
             Coefficients    Standard Error     t Stat    P-value
Intercept        269.0196           88.2314     3.0490     0.0081
Lag1               0.6187            0.0805     7.6899     0.0000

The quadratic model (Page Impressions vs Coded Month, Coded Month Sq) would be best suited for prediction: $\hat{Y} = 2176.3860 - 331.9129X + 16.0688X^2$, where X = coded month with month 1 = 0


3. (cont.) Bounce Rates:

Linear model: Bounce Rates vs Coded Month
r2 = 0.1788, adjusted r2 = 0.1275
               Coefficients    Standard Error     t Stat    P-value
Intercept            0.0378            0.0102     3.7129     0.0019
Coded Month         -0.0019            0.0010    -1.8665     0.0804

Quadratic model: Bounce Rates vs Coded Month, Coded Month Sq
r2 = 0.3571, adjusted r2 = 0.2714
                  Coefficients    Standard Error     t Stat    P-value
Intercept               0.0565            0.0131     4.3257     0.0006
Coded Month            -0.0089            0.0036    -2.5040     0.0243
Coded Month Sq          0.0004            0.0002     2.0398     0.0594

Exponential model: log(Bounce Rates) vs Coded Month
r2 = 0.1108, adjusted r2 = 0.0552
               Coefficients    Standard Error      t Stat    P-value
Intercept           -1.6299            0.1272    -12.8173     0.0000
Coded Month         -0.0180            0.0128     -1.4118     0.1772

Autoregressive model third order: Bounce Rates vs Lag1, Lag2, Lag3 (rows unused: 3)
r2 = 0.0478, adjusted r2 = -0.2119
             Coefficients    Standard Error     t Stat    P-value
Intercept          0.0180            0.0070     2.5876     0.0252
Lag1              -0.0925            0.3034    -0.3048     0.7662
Lag2               0.0146            0.2756     0.0528     0.9588
Lag3              -0.0514            0.0758    -0.6779     0.5118

Autoregressive model second order: Bounce Rates vs Lag1, Lag2 (rows unused: 2)
r2 = 0.0835, adjusted r2 = -0.0575
             Coefficients    Standard Error     t Stat    P-value
Intercept          0.0171            0.0040     4.3264     0.0008
Lag1              -0.0356            0.2566    -0.1387     0.8918
Lag2              -0.0594            0.0689    -0.8624     0.4041

Autoregressive model first order: Bounce Rates vs Lag1 (rows unused: 1)
r2 = 0.2620, adjusted r2 = 0.2128
             Coefficients    Standard Error     t Stat    P-value
Intercept          0.0130            0.0020     6.6254     0.0000
Lag1               0.1390            0.0602     2.3076     0.0357

The quadratic model (Bounce Rates vs Coded Month, Coded Month Sq) would be best suited for prediction: $\hat{Y} = 0.0565 - 0.0089X + 0.0004X^2$, where X = coded month with month 1 = 0

4. Forecast of next month's value:

Visitors: $\hat{Y}_{19} = 210.9523 + 0.4900(Y_{18}) = 210.9523 + 0.4900(514) = 462.8153$

Page Impressions: $\hat{Y} = 2176.3860 - 331.9129X + 16.0688X^2 = 2176.3860 - 331.9129(18) + 16.0688(18)^2 = 1408.2304$

Bounce Rates: $\hat{Y} = 0.0565 - 0.0089X + 0.0004X^2 = 0.0565 - 0.0089(18) + 0.0004(18)^2 = 0.0296 = 2.96\%$
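The three forecasts for month 19 can be evaluated with a few lines of code. This sketch plugs in the coefficients as printed (rounded to four decimals); the function names are hypothetical. Results from rounded coefficients differ slightly from the printed forecasts, most visibly for bounce rates, presumably because the printed values were computed from unrounded coefficients.

```python
def ar1_forecast(last_value, a0, a1):
    """One-step-ahead forecast from a first-order autoregressive model."""
    return a0 + a1 * last_value


def quadratic_trend(x, b0, b1, b2):
    """Forecast from a quadratic trend model at coded time x."""
    return b0 + b1 * x + b2 * x ** 2


# Month 19 has coded value X = 18 because month 1 is coded as 0.
x_next = 18

visitors_19 = ar1_forecast(514, a0=210.9523, a1=0.4900)
page_impressions_19 = quadratic_trend(x_next, 2176.3860, -331.9129, 16.0688)
bounce_rate_19 = quadratic_trend(x_next, 0.0565, -0.0089, 0.0004)

print(f"Visitors: {visitors_19:.4f}")
print(f"Page Impressions: {page_impressions_19:.4f}")
print(f"Bounce Rates: {bounce_rate_19:.4f}")
```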


Chapter 19

1. [Figure: Xbar-R chart of upload speed. Sample Mean panel: UCL = 1.1401, $\bar{\bar{X}}$ = 1.0003, LCL = 0.8605. Sample Range panel: UCL = 0.5126, $\bar{R}$ = 0.2424, LCL = 0. Horizontal axis: Sample (1 to 25).]

2. Since there are five observations for each day, you should use the $\bar{X}$ chart in conjunction with the range chart.

3. Mean and Range Charts: $\bar{\bar{X}}$ = 1.0003, $\bar{R}$ = 0.2424

Range Chart: UCL = 0.5126; LCL does not exist; $\bar{R}$ = 0.2424. There are no points outside the control limits and no violations of the rules 1 - 5.

$\bar{X}$ Chart: UCL = 1.1401; LCL = 0.8605; $\bar{\bar{X}}$ = 1.0003. There are no points outside the control limits and no violations of the rules 1 - 5.

The process is stable, so any attempt to reduce the common cause variation in the upload speed or to improve the upload speed must be undertaken by management by changing the process.
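The control limits above can be approximated from the overall mean and mean range using the standard control chart factors for subgroups of size 5 (A2 = 0.577, D3 = 0, D4 = 2.114). This is a sketch under the assumption that those tabled factors apply; because the factors and inputs are rounded, the results agree with the chart values only to about three decimals.

```python
# Standard control chart factors for subgroups of n = 5 observations.
A2, D3, D4 = 0.577, 0.0, 2.114

x_bar_bar = 1.0003   # mean of the sample means, from the chart above
r_bar = 0.2424       # mean of the sample ranges, from the chart above

# X-bar chart limits
ucl_x = x_bar_bar + A2 * r_bar
lcl_x = x_bar_bar - A2 * r_bar

# Range chart limits (D3 = 0 means there is no lower control limit)
ucl_r = D4 * r_bar
lcl_r = D3 * r_bar

print(f"X-bar chart: LCL = {lcl_x:.4f}, UCL = {ucl_x:.4f}")
print(f"R chart:     LCL = {lcl_r:.4f}, UCL = {ucl_r:.4f}")
```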
