Solutions Manual
For
Analyzing Data and Making Decisions
Statistics for Business Microsoft Excel 2010 Updated Second Edition
Judith Skuce
Part 1: Page 1-454 Part 2: Page 455-732
Contents

Part I: Introduction
Chapter 1: Using Data to Make Better Decisions .......... 1

Part II: Descriptive Statistics
Chapter 2: Using Graphs and Tables to Describe Data .......... 9
Chapter 3: Using Numbers to Describe Data .......... 51

Part III: Building Blocks for Inferential Statistics
Chapter 4: Calculating Probabilities .......... 65
Chapter 5: Probability Distributions .......... 88
Chapter 6: Using Sampling Distributions to Make Decisions .......... 108

Part IV: Making Decisions
Chapter 7: Making Decisions with a Single Sample .......... 129
Chapter 8: Estimating Population Values .......... 161
Chapter 9: Making Decisions with Matched Pairs Samples, Quantitative or Ranked Data .......... 194
Chapter 10: Making Decisions with Two Independent Samples, Quantitative or Ranked Data .......... 234
Chapter 11: Making Decisions with Three or More Samples, Quantitative Data—Analysis of Variance (ANOVA) .......... 275
Chapter 12: Making Decisions with Two or More Samples, Qualitative Data .......... 321

Part V: Analyzing Relationships
Chapter 13: Analyzing Linear Relationships, Two Quantitative Variables .......... 351
Chapter 14: Analyzing Linear Relationships, Two or More Variables .......... 389
Instructor’s Solutions Manual - Chapter 1
Chapter 1 Solutions

Develop Your Skills 1.1

1. You would have to collect these data directly from the students, by asking them. This would be difficult and time-consuming, unless you are attending a very small school. You might be able to get a list of all the students attending the school, but privacy protection laws would make this difficult. No matter how much you tried, you would probably find it impossible to locate and interview every single student (some would be absent because of illness or work commitments or because they do not attend class regularly). Some people may refuse to answer your questions. Some people may lie about their music preferences. It would be difficult to solve some of these problems. You might ask for the school's cooperation in contacting students, but it is unlikely they would comply. You could offer some kind of reward for students who participate, but this could be expensive. You could enter participants' names in a contest, with a music-related reward available. None of these approaches could guarantee that you could collect all the data, or that students would accurately report their preferences. One partial solution would be to collect data from a random sample of students, as you will see in the discussion in Section 1.2 of the text. Without a list of all students, it would be difficult to ensure that you had a truly random sample, but this approach is probably more workable than a census (that is, interviewing every student).

2. Because you need specific data on the quality of bicycle components, you would need to collect primary data. Customer complaints about quality are probably the only source of secondary data that you would have.

3. Statistics Canada has CANSIM Table 203-0010, Survey of household spending (SHS), household spending on recreation, by province and territory, annual, which contains information on purchases of bicycles, parts and accessories. There is a U.S. trade publication called "Bicycle Retailer & Industry News", which provides information about the industry. See http://www.bicycleretailer.com/. Access is provided through the Business Source Complete database. Industry Canada provides a STAT-USA report on the bicycle industry in Canada, at http://strategis.ic.gc.ca/epic/internet/inimr-ri.nsf/en/gr105431e.html. Somewhat outdated information is also available at http://www.ic.gc.ca/eic/site/sgas.nsf/eng/sg03430.html. Canadian Business magazine has a number of articles on the bicycle industry. One of the most recent describes the purchase of the Iron Horse Co. of New York by Dorel Industries (a Montreal firm). See http://www.canadianbusiness.com/markets/headline_news/article.jsp?content=b15609913.
Copyright © 2011 Pearson Canada Inc.
4. Although Statistics Canada takes great care in its data collection, errors do still occur, and data revisions are required. An interesting overview of GDP data quality for seven OECD countries is available at http://www.oecd.org/dataoecd/20/26/34350524.pdf. You should be able to locate other information about data revisions. See also http://www.statcan.ca/english/about/policy/infousers.htm, which describes Statistics Canada’s policy on informing users about data quality.

5. At least some of the secondary data sources listed in Section 1.1 should help you. If you cannot locate any secondary data, get help from a librarian.
Develop Your Skills 1.2

6. The goal for companies is to create population data, but it is unlikely that every customer is captured in any CRM database. There are many examples of companies using CRM data. A search of the CBCA database on August 7, 2009 produced a list of 102 articles (for 2009) that contained “customer relationship management” as part of their citation and indexing. For example, the publication called "Direct Marketing" regularly writes about database marketing, data mining, and web analytics. See http://www.dmn.ca/index.html.

7. This is a nonstatistical sample, and could be described as a convenience sample. The restaurant presumably has diners on nights other than Friday, and none of these could be selected for the sample. The owner should not rely on the sample data to describe all of the restaurant's diners, although the sample might be useful to test reaction to a new menu item, for example.
8. These are sample statistics, as they are based on sample data. It would be impossible to collect data from all postsecondary students.
9. Follow the instructions for Example 1.2c. The random sample you get will be different, but here is one example of 10 names selected randomly:

AVERY MOORE
EMILY MCCONNELL
HARRIET COOGAN
DYLAN MILES
TERRY DUNCAN
GEORGE BARTON
JAMES BARCLAY
AVA WORTH
PAIGE EATON
JORDAN BOCK
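Example 1.2c draws the sample with Excel's random number tools; the same idea can be sketched in Python. The list below is a hypothetical frame (the ten sampled names from the solution above plus two invented extras), not the text's actual student list:

```python
import random

# Hypothetical frame of student names; a real frame would list every student.
students = ["AVERY MOORE", "EMILY MCCONNELL", "HARRIET COOGAN",
            "DYLAN MILES", "TERRY DUNCAN", "GEORGE BARTON",
            "JAMES BARCLAY", "AVA WORTH", "PAIGE EATON",
            "JORDAN BOCK", "OLIVIA GRANT", "NOAH FIELDS"]

# Simple random sample of 10 names, drawn without replacement.
sample = random.sample(students, k=10)
for name in sample:
    print(name)
```

Because the draw is random, each run produces a different sample, just as each student attempting this exercise gets a different list of names.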
10. First, Calgary Transit will probably find it impossible to establish a frame for its target population, which is people with disabilities who use Calgary Transit. It will also have to carefully define what it means by “people with disabilities”. If this
means “people in wheelchairs”, then it will at least be possible to identify such riders when an interviewer visits a bus or a bus stop. However, it will be quite difficult for Calgary Transit to obtain a truly random sample of the opinions of people in wheelchairs who use Calgary Transit’s services. Coverage errors will be practically unavoidable. As well, if interviewers are approaching only those riders in wheelchairs, the survey respondents may be unhappy about being singled out because of their wheelchairs. They may refuse to answer the interviewer’s questions, leading to nonresponse errors. Interviewers will have to be trained carefully to overcome any resulting resistance of survey subjects. Because data will probably be collected on buses or at bus stops, with interviewers recording information while a bus is in motion, or possibly during bad weather at a bus stop, processing errors may occur. Finally, Calgary Transit will have to be sure that suitably qualified people are doing the analysis, to avoid estimation errors.

Develop Your Skills 1.3

11. This is impossible. A price cannot decrease by more than 100% (and a 100% decrease would mean the price was 0). It is likely that the company means that the old price is 125% of the current price. So, for example, if the old price was $250, then the new price would be $200. You can see that 250/200 = 1.25, or 125%.

12. The graph with the y-axis that begins at 7,000 is misleading, because it makes the index decline at the end of 2008 look more dramatic than it actually was. While the fall in the stock market index was significant, using a y-axis that begins with zero puts it in better perspective.

13. “Jane Woodsman’s average grade has increased from 13.8% last semester to 16.6% this semester.” The provocative language of the initial statement (“astonishing progress”, “substantial 20%”) is inappropriate. As well, the 20% figure, used as it is here, suggests something different from the facts. Jane’s grades did increase by 20%, but this is only 20% of the original grade of 13.8%, so it is not much of an improvement.

14. Aside from the fact that you should be suspicious of anyone who will not share the actual data with you, the local manager’s assurance that all is well may not be borne out by fact. Notice that the decrease claimed is in maximum wait times, not average wait times. It is entirely possible that average waiting times have increased. You need to see the data!

15. Yes. There is no distortion in how the data are represented, and the graph is clearly labelled and easy to understand.

Develop Your Skills 1.4

16. No! With such an observational study, this kind of conclusion about cause and effect is not justified. There could be many factors (other than income) that explain why children in wealthier families are better off. For example, parents in wealthier
families may have more confidence, and this may provide a very positive environment in which children flourish.

17. No. Even if the study was randomized (no information is provided), it would not be legitimate to make this conclusion. While taller men are more likely to be married, we cannot conclude that they are more likely to be married because they are taller. There could be many other factors at work.

18. It may be that the diary system contributed to increased sales. Because the data compare the same people before and after use of the diary system, there is some support for this conclusion. However, notice that only poor performers were selected for the trial. These people may have worked harder simply because it was clear their poor performance had been noticed.

19. If you had compared the sales performance of a randomly selected group of salespeople (not only poor performers), you would be able to come to a stronger conclusion about the diary system’s impact on increased sales.

20. There may be a cause-and-effect relationship here, but any conclusions should be made cautiously. For example, hotter weather or a nearby fair for children could have increased foot traffic (and sales) during the period.

Develop Your Skills 1.5

21. a. In this case, the national manager of quality control probably has a good grasp of statistical approaches. While you should still strive for clarity and simplicity, you can include more of the technical work in the body of the report. Printouts of computer-based analysis would be included in the appendix.

b. While human resources professionals probably have some understanding of statistical analysis, they are less likely to understand the details. In this case, you should write your report with a minimum of statistical jargon. The body of your report should contain key results, but the details of your analysis should be saved for the appendix.

c. In this case, you can assume no statistical expertise in your readers. While you should still report on how your analysis was conducted, and how you arrived at your conclusions, you probably would not send this part of the report to your customers. The report you send your boss should be easily understandable to everyone. The challenge here will be to make your conclusions easy to understand, while not oversimplifying, or suggesting that your results are stronger than they actually are.

22. It is incorrect to suggest that study of a random sample “proves” anything. This statement is much more definitive than can be justified. As well, the study was done on past customers, and may not apply to future customers. Nevertheless, such a study could be persuasive about what segment of the market the company should focus on.
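The percentage arithmetic behind exercises 11 and 13 above can be checked in a few lines (Python is used here for illustration, though the text works in Excel; the values are the ones from those solutions):

```python
# Exercise 11: a price cannot fall by 125%, but the old price can be
# 125% of the new one. With the illustrative old price of $250:
old_price = 250.0
new_price = old_price / 1.25
print(new_price)                  # 200.0, and 250/200 = 1.25 (125%)

# Exercise 13: a 20% rise on a low base is still a low grade.
last_semester = 13.8
this_semester = last_semester * 1.20
print(round(this_semester, 1))    # 16.6
```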
23. The average amount of paint in a random sample of 30 cans was 3.012 litres, compared with the target level of 3 litres. This sample mean is within control chart limits, indicating no need to adjust the paint filling line.

24. Analysis of output for a random sample of 50 workers showed an increase in worker output from 52 units per hour, on average, to 56 units per hour after the training. This evidence is sufficient to suggest that worker output increased after training. This may mean that the training caused the increase in output, but this can be determined only after an examination of the circumstances, to determine if there were other possible causes of the increase in output.

25. In fact, some studies have shown that there is a positive relationship between height and income (that is, taller people tend to have higher incomes). However, all such studies must be observational (there is no ethical way to control height!), and so the cause-and-effect conclusion suggested here is not valid. The statement could be rewritten as follows: “A study has shown a strong positive relationship between height and income.” You might even go on to discourage the unsophisticated reader from jumping to conclusions, as follows: “Of course, this should not be interpreted as meaning that greater height guarantees higher income, or that you cannot earn a high income if you are short.”

Chapter Review Exercises

1. Collecting data usually leads to a better understanding of the question and a better decision.

2. Businesses may not have all of the data available, because it is impossible or too costly to collect. For example, it is unlikely that a business would have detailed information available about every customer. However, reliable decisions can be made on the basis of detailed information about a random sample of customers.

3. It is generally not valid to draw conclusions about a population on the basis of a convenience sample. Because there is no way to estimate the probability of any particular element of the population being selected for a convenience sample, we cannot control or estimate the probability of error.

4. Students often use their average grade to summarize their performance in a semester.

5. Decision-makers may not be statisticians. Statistical analysis is powerful only if it is communicated so that those making the decision can understand the story the data are telling.

6. One of the difficulties with gathering data through personal interviews is that those being surveyed are sometimes charmed by the interviewer. This sometimes results in interviewees acting to please the interviewer, rather than providing honest or informed responses to questions. Respondents can also be misled by the interviewer, as Rick Mercer so successfully illustrates in the “Talking to Americans” segments.
For example, Rick Mercer managed to persuade Americans to congratulate Canadians on legalizing VCRs.

7. It is possible to make reliable conclusions on the basis of 1,003 responses. The sample size may seem small, when you consider there are millions of adult Canadians who have retirement funds. However, the sample size required depends not on the size of the population, but on its variability. If all Canadians were exactly the same, a sample of one would be sufficient. The more variable Canadians are, the larger the sample size required for estimates of a desired level of accuracy. This is something you will explore more in Chapter 8.
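The point that required sample size depends on variability rather than population size can be illustrated with a small simulation. This is a sketch with made-up populations, not data from the text:

```python
import random
import statistics

random.seed(42)

def spread_of_sample_means(population, n_samples=500, sample_size=100):
    """How much the mean of a random sample varies from draw to draw."""
    means = [statistics.fmean(random.sample(population, sample_size))
             for _ in range(n_samples)]
    return statistics.pstdev(means)

# Two populations of the same size and mean, but different variability.
identical_pop = [50.0] * 10_000                        # everyone identical
varied_pop = [random.gauss(50, 15) for _ in range(10_000)]

print(spread_of_sample_means(identical_pop))  # 0.0 -- one observation suffices
print(spread_of_sample_means(varied_pop))     # near 15 / sqrt(100) = 1.5
```

With no variability, a sample of one pins down the population average exactly; the more variable the population, the more sample means wander, so a larger sample is needed for the same accuracy.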
8. No. The applications are filled out by employers, not employees. Employers have a vested interest in portraying themselves as “top” employers. The sample is not representative. The Top 100 are selected from a self-selected sample.

9. Done correctly, such a study could identify a positive relationship between height and income. However, it is not correct to suggest the study means that height is the cause of the differences in incomes, although if the study was well-designed, other causes could have been randomized. Still, the leap to evolution as the explanation is not at all justified by the study. While this may be the explanation, it is only one of many possible explanations.

10. You collected your sample results at only one physical location at the school. Since it was near the smoking area, it is likely that your sample contained a disproportionate number of smokers. It is likely that smokers’ opinions about the new smoking policy would differ from nonsmokers’ opinions. Your sample is almost certainly not representative of the entire school community.

11. There are many examples of loyalty programs: Airmiles, President’s Choice Financial rewards, American Express rewards, HBC rewards, PetroCanada’s PetroPoints, Sears Club points, Aeroplan. Enter “loyalty rewards programs” into an Internet search engine, and you will find references to many such programs.

12. This is an observational study. It is not possible to draw a strong conclusion that drinking alcohol causes higher earnings. In fact, the causation may run the other way: higher income may cause more drinking (because those with higher incomes can afford to drink alcohol). The study measured income, not social networks, so the explanation provided is speculative. As well, there are other factors that could explain the differences in incomes, and these were not controlled in the study.

13. The article describes the New Coke story as "the greatest marketing disaster of all time". The research failed to uncover the attachment people felt to the original Coke. A question such as "Would you switch to the New Coke?" might have revealed how loyal customers were to the original Coke.
14. The Conference Board included the detail so that anyone reading the study could draw their own conclusions about the reliability of the data, and possible biases in the study results.

15. While the title used in the report is accurate, the percentage decrease is relatively small, and the actual number of drivers has increased. The title could easily be misunderstood.

16. The author has used Excel to generate Lotto 6/49 quick picks, but she didn't win the lottery either!

17. Of course, because your samples are randomly selected, your samples cannot be predicted. When the author did this exercise, the sample averages were as shown in the table below. The population average is 65.0. The sample averages ranged from 60.7 to 73.7. Some were quite close to the population average, and the largest difference between a sample average and the true value was 8.7.
Average Mark from 10 Randomly-Selected Samples of Size 10:
60.7  61.8  65.5  65.6  67.4  69.8  70.3  71.3  73.0  73.7
18. Again, the values you obtain cannot be predicted. These averages should be closer to the true population average, because they are based on more data (15 data points instead of 10). The author's results are shown in the table below. In general, the sample averages are closer to the true population average. The largest difference between a sample average and the true value is -6.4.
Average Mark from 10 Randomly-Selected Samples of Size 15:
58.6  61.2  62.6  64.0  66.2  66.9  67.2  67.7  68.4  70.8
19. Based on the author's results for exercises 17 and 18 above (yours will be different): the average of the sample averages, when the sample size is 10, is 67.9. The average of the sample averages, when the sample size is 15, is 65.4. The average of the sample averages is closer to the true population average value when the sample size is larger. You will investigate this more in Chapter 6.
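Exercises 17 to 19 can be reproduced with a short simulation. The sketch below uses a made-up population of marks with an average near 65, since the text's population data file is not reproduced here:

```python
import random
import statistics

random.seed(2)

# Hypothetical population of marks standing in for the textbook's data file.
population = [min(100.0, max(0.0, random.gauss(65, 12))) for _ in range(500)]

def sample_means(sample_size, n_samples=10):
    """Averages of repeated random samples, as in exercises 17 and 18."""
    return [statistics.fmean(random.sample(population, sample_size))
            for _ in range(n_samples)]

means_10 = sample_means(10)   # exercise 17: ten samples of size 10
means_15 = sample_means(15)   # exercise 18: ten samples of size 15

# Larger samples tend to give averages that cluster more tightly around
# the population average, a pattern explored further in Chapter 6.
print(round(statistics.fmean(population), 1))
print(min(means_10), max(means_10))
print(min(means_15), max(means_15))
```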
20. a. Because students are generally quite mobile, often moving from the place where they attended school to their place of work after graduating, it may be quite difficult to contact them for the graduation employment and satisfaction measures. Some graduates will almost certainly be missed, and if their opinions differ from those surveyed, the results will not be truly representative. The same kinds of problems arise with surveys of current students. Such students are surveyed in their classes, so those who are absent on the day of the survey are missed. It could be that the opinions of those who are absent are different from the opinions of those who are present in class (for example, it could be the case that students are absent because they do not find their classes relevant to their future careers). Employers may also be missed, so the potential for coverage errors exists throughout. Of course, as with any survey, all the other nonsampling errors are possibilities: nonresponse errors, response errors, processing and estimation errors. This is a large undertaking, and so there are more possibilities for error.

b. Although it could be the case that all colleges improved their services significantly between 1999-2000 and later years, this seems unlikely. There may be another explanation for the shift in the percentage of students satisfied with college services. In fact, a closer inspection reveals that about a half-dozen colleges improved their ratings quite significantly. Without more information, there is no way to know why this happened.

c. The report uses summary measures (percentages of responses for each category, for each year) and graphs (line graphs and bar graphs) to summarize the data.

d. In 1999-2000, an additional capstone question was included in the calculation. Since this question was not used in subsequent years, the average student satisfaction rates are not directly comparable.
Instructor’s Solutions Manual - Chapter 2
Chapter 2 Solutions

Develop Your Skills 2.1

1. The number of dented cans is a count of qualitative data (think of it this way—the original data might be recorded as “yes” or “no” to the question: is the can dented?). These are also time-series data, as they are collected over successive periods of time. The data are discrete.

2. Stock price data are quantitative data. These are also time-series data, as they were collected over three years. Prices are treated as continuous data.
3. The employees' final average grades from college are continuous quantitative data. The scores assigned by the supervisors are ranked data. Although the questionnaire will help, such a ranking is somewhat subjective. If different supervisors assign the ranks, they may not be comparable.

4. The price data are continuous quantitative data. They are also organized according to qualitative data on the size of the coffee. These are cross-sectional data, as they would be collected at around the same period in time.

5. Postal codes are qualitative data.
Develop Your Skills 2.2

6. The three different histograms are shown below.
[Three histograms titled "Survey of Drugstore Customers: Customer Ages", each with "Age of Customers" on the x-axis and "Number of Customers" on the y-axis, drawn with three different class widths.]
All three histograms clearly show that the distribution of customer ages is skewed to the right; that is, while most customers are under 40 years old, there are some customers who are much older, in fact as old as 85. [Note that when you are describing the distribution, it is not sufficient to stop at “skewed to the right”—you should explain what this means, in the context of this particular data set.] A class width of 5 is not a good choice for this data set. There are too many classes, many of which have only a very few data points. A class width of 10 or 15 would be a better choice.

7. A frequency distribution and histogram are shown below.
Survey of Drugstore Customers
Customer Income         Number of Customers
$30,000 to <$35,000      2
$35,000 to <$40,000      9
$40,000 to <$45,000     14
$45,000 to <$50,000      7
$50,000 to <$55,000      5
$55,000 to <$60,000      6
$60,000 to <$65,000      3
$65,000 to <$70,000      4

[Histogram titled "Survey of Drugstore Customers: Customer Incomes", with "Customer Income" on the x-axis and "Number of Customers" on the y-axis.]
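A frequency distribution like the one above can also be tallied in code. This sketch uses a handful of illustrative incomes, not the actual survey values from the text's data file:

```python
from collections import Counter

# Illustrative incomes only; not the drugstore survey data.
incomes = [31_500, 36_200, 38_900, 41_000, 42_500, 44_800,
           47_300, 52_100, 56_700, 61_200, 64_900, 68_800]

lower, width = 30_000, 5_000

# Assign each income to a $5,000-wide class: $30,000 to <$35,000, etc.
bins = Counter((income - lower) // width for income in incomes)

for i in sorted(bins):
    lo = lower + i * width
    print(f"${lo:,} to <${lo + width:,}: {bins[i]}")
```

Integer division by the class width is the programmatic equivalent of Excel's bin assignment: each income falls into exactly one class, with the lower limit included and the upper limit excluded.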
Choosing class widths is a bit tricky in this instance. The class width template suggests class widths of 5155, 8905, or 8170. None of these numbers is that comfortable for incomes. Class widths of $5,000 and $10,000 were considered. A class width of $10,000 was discarded because it would have resulted in only four classes (five is a good minimum number of classes).
The distribution of customer incomes in the drugstore survey is skewed to the right. Most customer incomes are in the $35,000 to <$50,000 range, but there are a number of customers with higher incomes, the highest being $68,800.

8. The stem-and-leaf display is shown below.
1 | 2 4 7 7 8 9
2 | 3 4 4 4 5 6 6 6 7 9 9 9
3 | 1 1 1 4 6
4 | 0 1
The lowest daily customer count in the random sample from Downtown Automotive is 12, and the largest is 41. Most days the shop deals with 20-some customers. It is unusual for the shop to deal with more than 35 customers.

9. This histogram totally fails at its job of summarizing the accompanying data set.
1. The graph does not have meaningful titles or labels. It completely fails to communicate what it’s about.
2. There are gaps between the bars, which there should not be.
3. It appears that the creator of this graph used bin numbers correctly, but s/he forgot to round them for presentation. The graph should show lower class limits along the x-axis in the proper location, that is, aligned under the left-hand side of each bar.
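A stem-and-leaf display like the one in exercise 8 can be built with a few lines of code. The counts below are values read off that display (treat them as illustrative; the text's data file is the authoritative source):

```python
from collections import defaultdict

# Daily customer counts read from the stem-and-leaf display in exercise 8.
counts = [12, 14, 17, 17, 18, 19, 23, 24, 24, 24, 25, 26, 26, 26, 27,
          29, 29, 29, 31, 31, 31, 34, 36, 40, 41]

stems = defaultdict(list)
for value in sorted(counts):
    stems[value // 10].append(value % 10)   # tens digit = stem, units = leaf

for stem in sorted(stems):
    print(stem, "|", " ".join(str(leaf) for leaf in stems[stem]))
```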
10. These graphs have not been properly set up, so Patty probably deserved her low mark in Statistics. The titles are not correct. They are not Ms. Nice’s marks, they are marks from a random sample of students in Ms. Nice’s statistics class, and the title should tell us that. The title for the marks from Mr. Mean’s class should be similarly adjusted. In both graphs, the label on the x-axis should say something like “Final Grade”. The label on the y-axis should say something like “Number of Students”. In the graph of the marks from Ms. Nice’s class, the labels are at the centre of each class, and should be adjusted. Horizontal grid lines would also help the reader. The graphs should be set up properly for comparison, with the same classes and scales on each axis. A quick glance at these graphs might lead you to think that the marks are lower in Mr. Mean’s class, but in fact, the opposite is the case.
Develop Your Skills 2.3

11. Either a bar graph or a pie chart would be appropriate.

[Bar graph titled "Survey of Drugstore Customers: Speed of Service Ratings", showing the number of customers giving each rating (Excellent, Good, Fair, Poor), and a pie chart of the same data: Excellent 6%, Good 38%, Fair 38%, Poor 18%.]
The graphs indicate that over a third of customers (38%) rated the speed of service as good, but only 6% rated it as excellent. Over a third of customers (again, 38%) rated the speed of service as only fair, while 18% rated it as poor. These ratings indicate that there may be some room for improvement in speed of service at the drugstore.
12. The most effective graph would be a bar chart, showing actual and desired relative frequencies for each colour. First, create a table for the data, and then create the bar chart.
Actual and Observed Colours in Candy
                         Red       Green     Blue      Yellow
Observed sample values   305       265       201        96
Desired Percentage       40%       30%       20%       10%
Sample Percentage        35.179%   30.565%   23.183%   11.073%

[Bar chart titled "Candy Colours, Random Sample After Reorganization of Production Process", comparing the desired percentage and the sample percentage for each colour.]
This graph makes it easy to see that the most important differences in the candy colour distribution are in the red candies (fewer than desired) and in the blue candies (more than desired).
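The sample percentages in the table for exercise 12 come from dividing each colour's count by the sample total:

```python
observed = {"Red": 305, "Green": 265, "Blue": 201, "Yellow": 96}
desired = {"Red": 0.40, "Green": 0.30, "Blue": 0.20, "Yellow": 0.10}

total = sum(observed.values())   # 867 candies in the sample

for colour, count in observed.items():
    share = count / total
    # Red falls short of its desired share; Blue exceeds it.
    print(f"{colour}: sample {share:.3%} vs desired {desired[colour]:.0%}")
```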
13. Since we want to compare the number of defects by shift, it is appropriate to compare the categories for the number of defects across the horizontal axis. Since each shift produced a different number of items,[1] it makes sense to use relative frequencies. For each shift, calculate the percentage of items with no defects, one minor defect, more than one minor defect, and then create a bar graph.
[Bar chart titled "Defects Observed at a Manufacturing Plant, by Shift", with the y-axis "Percentage of Total Number of Items Produced". For each defect category (items with no apparent defects, items with one minor defect, items with more than one minor defect), bars compare the 8:00 a.m. – 4:00 p.m., 4:00 p.m. – midnight, and midnight – 8:00 a.m. shifts.]
Across all three shifts, the percentage of items produced with more than one minor defect is small. For all three shifts, by far the greatest percentage of items produced has no apparent defects. The midnight-8:00 a.m. shift has the greatest percentage of defects, and the 4:00 p.m. – midnight shift has the lowest percentage of defects.
[1] This too is interesting, and the fact should be included in any accompanying report.
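The relative-frequency calculation in exercise 13 can be sketched as follows, with made-up shift counts (the real counts are in the text's data file):

```python
# Hypothetical counts of items by defect category for each shift.
shifts = {
    "8:00 a.m. - 4:00 p.m.": {"no defects": 470, "one minor": 25, "more than one": 5},
    "4:00 p.m. - midnight":  {"no defects": 445, "one minor": 18, "more than one": 4},
    "midnight - 8:00 a.m.":  {"no defects": 400, "one minor": 40, "more than one": 8},
}

# Each shift produced a different number of items, so compare relative
# frequencies (each category as a share of the shift's own output).
for shift, counts in shifts.items():
    total = sum(counts.values())
    for category, n in counts.items():
        print(f"{shift}: {category}: {n / total:.1%}")
```

Converting to each shift's own percentages is what makes the three shifts comparable despite their different output totals.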
14. [Bar chart titled "Financial Services Company Customer Survey: 'The staff at my local branch can provide me with good advice on my financial affairs.'", showing the number of customers who responded Strongly Agree, Agree, Neither Agree nor Disagree, Disagree, and Strongly Disagree.]
The majority of the customers surveyed agree with the statement that staff at the local branch can provide good advice on financial affairs. A significant number of the customers neither agreed nor disagreed with the statement, and it might be worthwhile to investigate why these customers appeared to have no opinion (was it lack of knowledge?). There were customers who disagreed or strongly disagreed. It might be worthwhile to investigate further (why was this the case? Were these customers disappointed in past advice, or do they just have an impression that local staff cannot provide good advice?).
15. [Bar graph titled "Survey of a Random Sample of People Walking Around Kempenfelt Bay", showing the number of people naming each favourite flavour of ice cream: Vanilla, Chocolate, Strawberry, Maple Walnut, Chocolate Chip, Pralines, Other.]
Vanilla and chocolate were tied as the most frequently-mentioned favourite flavours of ice cream among the people surveyed at Kempenfelt Bay, followed by chocolate chip. The fourth most popular flavour was strawberry. Only a few people cited maple walnut as their favourite flavour, and no one called pralines their favourite. One person had a favourite flavour other than the ones cited specifically in the survey. Of course, there are other options for this graphical display. It might be helpful to arrange the categories from most preferred to least-preferred. As well, a pie graph is an option.
[Bar graph: the same survey with flavours arranged from most preferred to least preferred.]
[Pie chart: Survey of a Random Sample of People Walking Around Kempenfelt Bay — favourite flavours: Vanilla 28%, Chocolate 28%, Chocolate Chip 21%, Strawberry 17%, Maple Walnut 4%, Other 2%, Pralines 0%.]
Develop Your Skills 2.4 16. Automobile sales are seasonal and cyclical, although this may not be as much the case as it once was. Sales tend to be higher when new models become available, and generally, auto sales are lower at year-end, when many people are focused on the holiday season. For these reasons, monthly sales data would be appropriate. Annual data would hide the month-to-month variations in sales. 17. Whatever your data source (as long as the data are accurate) you should see that over this period, the price of $1US in Canadian dollars was on an increasing trend from January 2000 until the beginning of 2002, with the highest exchange value of 1.599618 (monthly average) in January of 2002. From then until near the end of 2007, the exchange value of the US dollar in terms of Canadian dollars was on a declining trend, reaching 0.968 in November of 2007. The rate then stabilized around par, beginning to increase in the latter part of 2008, ending with a monthly average rate of 1.2343619 in December. The graph below shows the trends.
18. The Bank of Canada Bank Rate was at 4.5% in January 2007, and stayed there until July of 2007, when it increased to 4.75%. The rate declined to 4.5% in December of 2007, and continued to decline to 3.25% in April of 2008. The rate held steady at 3.25% until October of 2008, when it declined to 2.5%, further falling to 1.75% in December of 2008. Usually a line graph would be used for such a long time series. However, in this case, the movements in the Bank Rate are infrequent, and small, so the graph is not too cluttered. The advantage in using a bar chart is that it highlights the change in rates from one period to the next. 19. Your commentary should describe the data for the company you chose. Here is a checklist to help you: - Be sure to note the start and end dates for the data. - Comment on at least a couple of specific values in the data set (e.g., the high and low for the period). - Keep your language objective and descriptive. Do not leap to any conclusions about why the data might look the way they do.
20.
[Line graph: Computer Price Index, Canada, Consumers (2002 = 100) — monthly index values from January 2002 to 2008.]
Your graph should look something like this. Be sure that it is labelled completely and correctly. For example, it is important to indicate the base year for any price index. Your commentary should note that this price index has declined significantly since 2002, with the index hitting a low of 21.91 in May of 2008. The decline in the price index was most pronounced at the beginning of the period, in 2002. The rate of the decline in this price index has slowed somewhat at the end of the period (late 2007 and early 2008).
Develop Your Skills 2.5 21.
[Scatter diagram: Spending on Restaurant Meals and Income — monthly spending on restaurant meals ($0 to $250) versus monthly income ($0 to $5,000).]
There appears to be a slight positive relationship between monthly income and monthly spending on restaurant meals, that is, the higher the monthly income, the greater the monthly spending on restaurant meals. However, there is a great deal of variability in the spending on restaurant meals, and the relationship is weak.
22.
[Scatter diagram: Jack's Cookies, Daily Sales — quantity sold (45 to 70) versus price per cookie ($0.40 to $1.10).]
Note this graph shows the data with the explanatory variable (price) on the x-axis, and the response variable (quantity sold) on the y-axis, which matches convention. However, if you have taken economics, you might recognize this as a demand curve. For historical reasons, a demand curve is normally graphed with price on the y-axis. So, it is also acceptable to graph these data as follows:
[Scatter diagram: the same data with price per cookie on the y-axis and quantity sold on the x-axis.]
Notice that in both graphs, the axes do not begin at (0, 0). It is reasonable to scale the axes as shown, but this should always be clearly indicated. The data for daily sales of Jack’s Cookies show that the quantity sold and the price are negatively related, that is, the higher the price, the lower the quantity sold. This conforms to the Law of Demand.
23.
[Scatter diagram: Survey of Drugstore Customers — amount of most recent purchase ($0 to $45) versus customer annual income ($20,000 to $80,000).]
There does not appear to be a strong relationship between the customer’s income and the amount of the most recent purchase. Note the scale on the x-axis does not start at zero. If you think about it, this should not come as a surprise. While we might have expected those customers with greater incomes to have higher purchases, this effect is less likely to appear for a single purchase. There may be more of a positive relationship between annual income and total annual drugstore purchases. 24.
[Scatter diagram: Hours of Work and Semester Marks, Random Sample of Students — semester average mark (%) versus total hours of paid work during semester (0 to 400).]
There appears to be a negative relationship between the total hours of paid work during the semester, and the semester average mark, that is, the greater the hours of work, the lower the semester average mark.
25. Exhibit 2.70c is not correct, because the explanatory variable is years of service, and it should be graphed on the x-axis. Exhibit 2.70b is probably not correct, because it depicts a negative relationship, that is, those with more years of service earn lower salaries. Exhibit 2.70a is the only possible choice, as it shows higher salaries associated with longer years of service.

Develop Your Skills 2.6

26. The first obvious problem with this graph is the 3-D aspect. It makes it hard to read the height of the bars. It is not clear if the bars for "good" and "fair" are the same height. A bar graph is an appropriate way to represent these data, but it would be improved if the 3-D aspect were removed, as shown below.
[Bar graph: Speed of Service Ratings, Survey of Drugstore Customers — number of customers choosing Excellent, Good, Fair, or Poor.]
Another possible improvement would be to calculate relative frequencies for each rating category.
27. The pictograph looks as follows:
The year 2000 dollar is worth just under half of the value of the 1980 loonie, but the total area of the year 2000 loonie is only about a quarter of the area of the 1980 loonie, so the pictograph misleads the viewer. When the loonie is shrunk, it shrinks not only in height (as a bar in a bar graph would) but also in width, so that the image is not distorted in shape. However, this decreases the area disproportionately.

28. This is a good graph. The labels and titles are clear, and the graph can be understood without reference to anything else. We can see that there appears to be a positive relationship between the total monthly sales for Hendrick Software salespeople and the number of sales contacts during the previous month.

29. This graph cannot be interesting, because we have no clue what it is about. We can see that the distribution is skewed, but that's all. With no title, and no meaningful labels on the axes, the graph is useless.
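The distortion in the Exercise 27 pictograph follows from simple geometry: scaling an image's linear dimensions by a factor k scales its area by k². A minimal Python sketch (the 0.5 scale factor is illustrative, standing in for "just under half"):

```python
# Scaling an image uniformly by k changes its area by k**2,
# which is why shrunken pictograph images exaggerate a decline in value.

def area_ratio(scale: float) -> float:
    """Area of a uniformly scaled image, relative to the original."""
    return scale ** 2

# A loonie drawn at half height (and half width, to keep its shape):
print(area_ratio(0.5))  # → 0.25: only a quarter of the original area
```

A bar in a bar graph, by contrast, shrinks in one dimension only, so its area stays proportional to the value it represents.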
30. There are quite a few categories in this data set, and a bar graph would be preferred. Also, the title is not correct. The survey was of favourite flavours of ice cream (not people). The labels on the pie slices contain the code for the flavour, which is unnecessary and only serves to clutter up the graph. There is also a spelling mistake in one of the labels: “maple walnut”. A better graph is shown below. It has the advantage of sorting the flavours from most favourite to least favourite.
[Bar graph: Survey of Favourite Ice Cream Flavours, Random Sample of People Walking Around Kempenfelt Bay — number of people by flavour (Vanilla, Chocolate, Chocolate Chip, Strawberry, Maple Walnut, Pralines and Cream, Other), sorted from most to least favourite.]
Chapter Review Exercises

1a. These data are qualitative, unranked, and cross-sectional.
1b. These data are qualitative, unranked, and cross-sectional.
1c. These data are quantitative, discrete, and cross-sectional.
1d. These are ranked qualitative data.
1e. These are time-series continuous quantitative data.

2a. A double bar graph could show males and females along the x-axis, with two bars above, one for those with fitness club membership, one bar for those without. Alternatively, categories of fitness club membership could show along the x-axis, with bars for males and females above each category.
2b. A bar graph could be organized with the four store locations along the x-axis, and bars above, each one corresponding to the type of payment. Alternatively, the payment types could show along the x-axis, with four bars above each, one for each store location.
2c. If the total number of pedestrians is recorded, there are only two data points, the number of people who passed by each location. A graph would not really add much to a simple table displaying these numbers, with a proper title and headings.
2d. A double bar graph could be used, with the ratings ("barely edible" to "absolutely delicious") showing along the x-axis, and two bars above each rating, one for each chef.
2e. It is likely that there is interest in the relationship between sales and advertising. A scatter diagram would be appropriate, with advertising along the x-axis, and sales on the y-axis.

3.
These graphs are meant to be amusing and entertaining. Quirky images and bright colours make them attractive, but they are not good examples of graphs to summarize data.
4.
The stem and leaf display is shown below. The order of the leaves in your display may be different, if you went through the data by rows instead of columns.
0 | 9 8 8 9
1 | 9 2 2 3
2 | 2 1 0 0
3 | 1 2 0
4 | 3 0
5 | 8 0
6 | 4
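A stem-and-leaf display like this can also be generated programmatically. Below is a minimal Python sketch; the ages are reconstructed from the display above (stems are tens, leaves are ones), and this version sorts the leaves within each stem:

```python
from collections import defaultdict

def stem_and_leaf(data):
    """Return the lines of a stem-and-leaf display (stems = tens digits)."""
    stems = defaultdict(list)
    for value in sorted(data):
        stems[value // 10].append(value % 10)
    return [f"{stem} | {' '.join(str(leaf) for leaf in stems[stem])}"
            for stem in sorted(stems)]

# Ages read off the display above (treat these as reconstructed, not official)
ages = [9, 8, 8, 9, 19, 12, 12, 13, 22, 21, 20, 20,
        31, 32, 30, 43, 40, 58, 50, 64]
for line in stem_and_leaf(ages):
    print(line)
```

Because the function sorts first, its leaves appear in ascending order within each stem, which is the conventional final form of the display.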
The data set is skewed to the right. People under 30 years old form the largest age groups in the sample. There are only two people in their 40s, two in their 50s, and only one in their 60s.

5.
It appears that there is a positive correlation between months of experience in the company, and salary, but the correlation does not appear to be particularly strong. In fact, the two observations with the highest salaries give the appearance of a positive correlation, and without them, there is no obvious relationship.
6.
Since these are quantitative data, histograms are required. Since the data are quite different in range, it is a challenge to decide what class width to use. The graphs below show a class width of $10, which is probably too wide for the data for purchases by males, but allows comparison with the purchases by females. Note that it might have been wise to use relative frequencies, rather than frequencies, to make this comparison, since the data sets have different sizes. However, the sample sizes differ by only one, so it is not crucial to do this here.
[Histogram: Music Store Purchases by Females — number of purchases by value of purchase.]
[Histogram: Music Store Purchases by Males — number of purchases by value of purchase.]
The histograms show that there is more variability in the music store purchases by females. As well, there are more purchases of higher value for females than males. The purchases by males are in the $10-$50 range, while the purchases by females are in the $10-$60 range.
7.
There are two possible graphical displays, a bar chart or a pie chart. Both are shown below. The pie chart has been formatted for black and white printout.
[Bar graph: Ratings from a 360 Degree Review for a Trainee — number of ratings in each category, from Best Possible Performance to Worst Possible Performance.]
[Pie chart: the same ratings as percentages — Best Possible Performance 7%; Very Good, Very Little Improvement Required 20%; Good 27%; Acceptable 13%; Poor, With Major Improvement Required 27%; Worst Possible Performance 6%.]
Whichever graphical display is used, it is apparent that the trainee's ratings are not consistent. About 67% of raters indicated that the trainee's performance was acceptable or better. However, 27% suggested that major improvement was required, and 6% rated the trainee's performance as the worst possible. Certainly, there seems to be a wide range of opinions about this trainee.

8.
An appropriate graphical display is shown below.
[Double bar graph: Employee Ratings of Previous and Current Presidents — number of ratings at each score from 1 to 10 (1 = worst possible, 10 = best possible performance), for the previous and the current president.]
The performance ratings for the new president are generally lower than for the previous president. However, there seems to be great variability in the ratings for both presidents.
9.
In this case, since the number of students in each sample is the same, it is appropriate to compare the number of students directly. An appropriate graph is shown below. The graph shows that the B.C. students were much more likely to rate this university as “excellent” than the Ontario students, with Ontario students much more likely to rate it as “poor”. The Ontario and B.C. students have different opinions about this university.
[Double bar graph: Ratings of a Canadian University by Ontario and BC Students — number of students choosing each rating (Excellent, Good, Fair, Poor).]
10. A graph to summarize the data is shown below.
[Clustered bar graph: Payments by Type at Four Store Locations — percentage of payments by cash/debit card, credit card, and cheque at Stores A through D.]
In this case, since the total number of payments is different at the stores, percentage of payments is displayed on the graph, so that the values are directly comparable. The graph shows that the percentage of payments by cash or debit card is highest at Store A, at 40% of payments, and lowest at Store C, accounting for only 20% of payments. The percentage of payments made by credit card is 30% at both Stores A and C, and is 40% at Stores B and D. Cheques account for 50% of the payments at Store C, which is higher than at any other store. The other three stores have a similar percentage of payments by cheque, from 27.5% to 35%.
11. Since the samples are different sizes, relative frequencies must be used to make the comparison.
[Clustered bar graph: Origins of Students in College Programs — percentage of students from the local area versus not from the local area, for the Business, Technology, and Nursing programs.]
All three program areas draw a greater percentage of students from outside the local area, although the tendency is strongest for the Business program (about 57% of students not from the local area) and weakest for Nursing (about 51% of students not from the local area).
12. The use of the glass with a swizzle stick does not make the graph more interesting, it just makes it more difficult to read. It is quite difficult to judge the level of operating revenues from the pictures—is it the top of the glass or the top of the swizzle stick that we should read? A bar graph (or a line graph) would be a better choice to display these data, as shown below.
[Bar graph: Soft Drink Company, Net Income ($ millions), 2004 to 2008.]
13. Two histograms, properly set up for comparison, are shown below.
[Histogram: Marks for a Random Sample of Students in Ms. Nice's Statistics Class — number of marks by final grade (%).]
[Histogram: Marks for a Random Sample of Students in Mr. Mean's Statistics Class — number of marks by final grade (%).]
Notice that the graphs are set up with the same x- and y-axis scales, for direct comparison. They are also similarly sized, so that it is possible to make a direct visual comparison. Class widths of 10 were used, because these are comfortable for marks data, and they allow us to make a distinction between passing and failing grades (assuming 50 is a pass).
The marks of the students from Mr. Mean’s class are generally higher and less variable than the marks of the students from Ms. Nice’s class. Half of the students from Ms. Nice’s class failed the course, while only two of the students from Mr. Mean’s class failed. 14. The two histograms are shown below.
[Histogram: Daily Pedestrian Traffic at Location 1 — number of days by number of pedestrians.]
[Histogram: Daily Pedestrian Traffic at Location 2 — number of days by number of pedestrians.]
(Note that the histograms are set up with matching x- and y-axes, and are sized similarly, for ease of comparison. Because the two locations were surveyed for the same number of days, we can compare the numbers directly.)
The histograms clearly show that daily pedestrian traffic is more variable at Location 2 than at Location 1. At Location 1, the daily traffic is in the 75-150 range, while at Location 2, it is in the 45-195 range. Generally, it appears the daily traffic at Location 1 is less than at Location 2. For both locations, the histograms are reasonably symmetric. 15. An appropriate histogram is shown below.
[Histogram: Downtown Automotive, Random Sample of Daily Sales — number of days by daily sales.]
For Downtown Automotive, daily sales are usually above $1,000, with sales falling into the $1,000 to < $1,500 class on 10 of the 29 days in the sample. The distribution is somewhat skewed to the right, that is, there are a few days when sales are above $2,000. Daily sales range from $690 to $2,878.
16. The appropriate graph is a line graph, such as the one shown below. It covers the 10-year period ending in May 2008. You will have more recent data available.
[Line graph: Average Retail Prices of Regular Unleaded Gasoline at Self-Service Filling Stations in Montreal — cents per litre, monthly, June 1998 to May 2008.]
Over the 10-year period from June 1998 to May 2008, retail gas prices have been rising. The lowest price over the period was 52.2¢ per litre, in February of 1999, and the highest price was $1.36 per litre, in May of 2008. Prices rose fairly rapidly over the end of 1999 and the beginning of 2000, and then stayed fairly steady until June of 2001. At that point, prices fell, from 84.5¢ per litre in May of 2001 to 61.9¢ in November of 2001. They then began to climb again, reaching a high of $1.185 in September of 2005. Retail gas prices in Montreal showed great variability in the range between 88.5¢ and $1.145 per litre through 2006 and 2007, with a sharp increase in April and May of 2008.
17. In this case, while there may be an association between the two variables, the causality link would not be strong. It would not be correct to say that a high mark in Business Math caused a high mark in Statistics, because there is very little overlap between the content of the two courses. However, a student with good study habits and good class attendance might do better in both courses. In this case, while the Business Math mark is not really the explanatory variable, since this course came first, we will put it on the x-axis. A graph of the data is shown below.
[Scatter diagram: Marks from First and Second Year for a Random Sample of Students — statistics mark (%) versus business math mark (%).]
There does appear to be a positive relationship between the two marks.
18. A graph of the data is shown below.
[Scatter diagram: Woodbon Furniture Company — annual sales versus annual advertising expenditure.]
It appears there is a positive correlation between advertising and sales, that is, when advertising expenditure is higher, annual sales are also higher. 19. (Choosing an appropriate class width for comparison takes some thought. $10,000 is probably too wide (resulting in only 4 classes), and $5,000 is probably too narrow. A class width of $7,500 was used for the two histograms shown on the next page. Because the samples are of different size, relative frequencies should be used for comparison.)
[Histogram: Survey of Drugstore Customers, Annual Incomes of Males — percentage of male customers by annual income.]
[Histogram: Survey of Drugstore Customers, Annual Incomes of Females — percentage of female customers by annual income.]
Annual incomes for female drugstore customers are generally in the $37,500 to < $45,000 class, which accounts for over 48% of female customers' incomes. Some incomes of female customers are higher, but this is unusual in the sample, so the distribution of female customers' incomes is skewed to the right. In contrast, the incomes of male drugstore customers are more variable. Incomes between $30,000 and < $60,000 account for over 86% of male customers' incomes, with incomes spread fairly evenly throughout this range. In general, greater percentages of male customers' incomes are in the higher classes.
20. The appropriate graph is shown below. Note that the graph shows percentages of males and females, because of the different sample sizes.
[Clustered bar graph: Drugstore Customer Survey, Speed of Service Ratings — percentage of male and of female customers choosing each rating (Excellent, Good, Fair, Poor).]
Approximately the same small percentage of male and female customers rated the speed of service at the drugstore as excellent (about 5% of male customers and about 7% of female customers). The largest group of female customers (about 55%) rated the speed of service as "fair", and the largest group of male customers (about 57%) rated the speed of service as "good". It appears that male and female customers rate the speed of service very differently at the drugstore.
21. Two histograms are shown below. Note that samples are the same size, so relative frequencies are not required.
[Histogram: Flight Delays Before Airport Upgrades — number of flights by delay in minutes.]
[Histogram: Flight Delays After Airport Upgrades — number of flights by delay in minutes.]
The histograms seem to indicate that flight delays have changed after the airport upgrade. Before the upgrade, flight delays were mainly in the 10 to < 40 minute range. Only two delays were less than 10 minutes, and three were more than 40 minutes (but less than 50 minutes). After the upgrades, there were seven delays less than 10 minutes, so a greater number of flights had shorter delays. As well, there was only one flight delayed more than 40 minutes. However, the number of flight delays
of 10 - < 20 minutes has been reduced from 7 to 3 after the upgrades, while the number of delays of 20 - < 30 minutes has increased from 13 to 14. The greater number of flights with delays less than 10 minutes indicates some reduction in delays, but results appear mixed.
22. The two graphs are shown below.
[Clustered bar graph: Rating by Students of College Experience — number of students choosing each rating (Excellent, Good, Fair, Poor) in Business Studies, Computer Studies, and Engineering Technology Studies.]
[Clustered bar graph: the same ratings expressed as percentages of students in each program.]
Because the programs have different numbers of students, the first graph using student numbers distorts the comparison. For example, it appears as if Business Studies and Engineering Technology Studies students choose a rating of "good" equally. However, the second graph reveals that a greater percentage of Engineering Technology Studies students rate their college experience as good. While the relative sizes of the ratings for each individual program remain the same, comparisons across programs are not valid unless relative frequencies are used.
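The conversion that makes the second graph comparable across programs is just counts divided by the program total. A minimal Python sketch (the counts below are hypothetical placeholders, not the exercise's data file):

```python
# Convert rating counts to relative frequencies within one program,
# so programs of different sizes can be compared fairly.
# These counts are made up for illustration only.

def relative_frequencies(counts):
    """Map each category's count to its share of the group total."""
    total = sum(counts.values())
    return {category: count / total for category, count in counts.items()}

business = {"Excellent": 60, "Good": 300, "Fair": 120, "Poor": 20}
shares = relative_frequencies(business)
print({rating: f"{share:.0%}" for rating, share in shares.items()})
```

Applying the same function to each program gives bars that sum to 100% within every program, which is what makes cross-program comparison valid.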
23. The three histograms are shown below.

[Three histograms: Quarterly Operating Profits, Canadian Oil and Gas Extraction and Support Activities, first quarter 1988 to third quarter 2008 — number of quarters by operating profit (millions of dollars), drawn with three different class widths, from narrowest to widest.]
All three histograms show the same general shape, that is, the distribution is right-skewed. In most quarters, operating profits in the oil and gas sector were below $1.5 billion, but there were much higher profits in some quarters. In the first histogram the classes may be too narrow, as there are very low frequencies in many of the classes. However, this histogram provides more information about the many quarters when operating profits were low, as there is a breakdown for below $1 billion, and from $1 billion to < $2 billion. This information is hidden in the histogram with the widest classes. It can be a challenge to decide on appropriate class widths when the distribution is very skewed. In the histogram with the widest classes, a lot of data is contained in the first class (half of the data points are there), and so these classes may be a bit wide. However, any one of these histograms would be acceptable. The particular choice depends on the focus of the analysis.

24. Because you will have more up-to-date data, we cannot provide the histograms for this question. However, you should use the histograms and commentary in the text as guidelines. Be sure to use the same class widths for your comparison, and size your histograms similarly. Choose a class width that works for both data sets (you will probably be able to use $2 billion as the class width, as in the text). Remember that your commentary should simply describe the data sets. Do not get carried away with speculation about why the data look the way they do.
Instructor’s Solutions Manual - Chapter 3
Chapter 3 Solutions

Develop Your Skills 3.1

1. Σy = 2 + 4 + 6 + 8 = 20

2. Σy² = 2² + 4² + 6² + 8² = 120
(Σy)² = (2 + 4 + 6 + 8)² = 20² = 400
The answers are different because of the different order of operations: Σy² squares each value first and then sums, while (Σy)² sums first and then squares.

3. ȳ = Σy/n = 20/4 = 5
x̄ = Σx/n = 16/4 = 4

4. Σ(x − 4) = (1 − 4) + (3 − 4) + (5 − 4) + (7 − 4) = −3 + (−1) + 1 + 3 = 0
Σ(y − 5) = (2 − 5) + (4 − 5) + (6 − 5) + (8 − 5) = −3 + (−1) + 1 + 3 = 0

5. Consider the data set: 34, 67, 2, 31, 89, 35. For this data set, calculate:
a. Σx = 34 + 67 + 2 + 31 + 89 + 35 = 258
b. Σx² = 34² + 67² + 2² + 31² + 89² + 35² = 15756
c. x̄ = Σx/n = 258/6 = 43
d. s = √[(Σx² − (Σx)²/n) / (n − 1)] = √[(15756 − 258²/6) / 5] = √[(15756 − 11094) / 5] = √(4662/5) = √932.4 = 30.535
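The order-of-operations results in Exercise 2, and the computational formula in Exercise 5 part d, can be verified with a short script (a sketch using only the Python standard library):

```python
import math
import statistics

y = [2, 4, 6, 8]
assert sum(v ** 2 for v in y) == 120   # Σy²: square each value, then sum
assert sum(y) ** 2 == 400              # (Σy)²: sum first, then square

x = [34, 67, 2, 31, 89, 35]
n = len(x)
# Computational formula for the sample standard deviation (part d):
s = math.sqrt((sum(v ** 2 for v in x) - sum(x) ** 2 / n) / (n - 1))
print(round(s, 3))                          # → 30.535
assert abs(s - statistics.stdev(x)) < 1e-9  # matches the definitional formula
```

The final assertion confirms that the computational form gives the same value as the definitional formula built into statistics.stdev.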
Develop Your Skills 3.2 6. The mean age is 41.2, the median age is 35.5, and the mode of the ages is 30. In this case, because the data set is severely skewed to the right (as we saw when we created the histogram of ages in Develop Your Skills 2.2, Exercise 6), the median is the better measure of central tendency.
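Outside Excel, the same three measures are one call each in Python's statistics module. The small right-skewed sample below is made up for illustration (it is not the textbook's age data file):

```python
import statistics

# Hypothetical right-skewed ages, invented for illustration
ages = [30, 30, 31, 33, 35, 36, 38, 42, 67, 88]

print(statistics.mean(ages))    # → 43.0, pulled upward by the two large values
print(statistics.median(ages))  # → 35.5, resistant to the outliers
print(statistics.mode(ages))    # → 30, the most frequent value
```

As in the exercise, the right skew leaves the mean well above the median, which is why the median is the better summary here.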
7.
The mean income is $47,868.10, and the median income is $44,925. This data set is skewed to the right (as we saw when we created the histogram of incomes in Develop Your Skills 2.2, Exercise 7). As a result the unusually high incomes have pulled the mean to the right of the median. The median is the better measure of central tendency.
8.
From the stem and leaf display we constructed in Develop Your Skills 2.2, Exercise 8, we can see that this data set is slightly skewed to the right, but not much. This is reflected in the calculations of mean and median (when in doubt, calculate both!). The mean of this data set is 26.12, and the median is 26. Either would be acceptable as a measure of central tendency, but the mean is preferred, because its calculation depends on the value of every single data point in the data set.
9.
Because the mean and the median are almost equal, we expect the distribution to be symmetric.
10. Because the quarterly operating profits of the oil and gas sector are highly skewed to the right, the median of $1.816 billion is the appropriate measure of central tendency. Although the distribution of operating profits for the manufacturing sector is not as skewed, so we might have considered using the mean as a measure of central tendency, we must use the median so that we are comparing the same measure for both data sets. The median quarterly operating profit for the manufacturing sector is $8.909 billion. Generally, the quarterly operating profits are much higher for the manufacturing sector than for the oil and gas sector.

Develop Your Skills 3.3

11. Since the age data are skewed to the right, the IQR is the best measure of variability. Using Excel calculations, we find:
Q1 = 31
Q3 = 42
IQR = 11
The Empirical Rule could not be applied here, as the data are not symmetric and bell-shaped.
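Quartiles depend on an interpolation convention, so different software can disagree slightly. Excel's QUARTILE function uses the inclusive method, which Python's statistics.quantiles can mirror. A sketch with a made-up sample (not the textbook's age data):

```python
import statistics

# Illustrative right-skewed sample, invented for this sketch
data = [22, 25, 28, 31, 33, 35, 38, 42, 47, 55, 61]

# method="inclusive" matches Excel's QUARTILE / QUARTILE.INC convention
q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")
iqr = q3 - q1
print(q1, q3, iqr)  # → 29.5 44.5 15.0
```

The default method="exclusive" would give slightly different quartiles for the same data, which is worth remembering when checking Excel answers against other tools.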
12. Since the data are skewed to the right, the IQR is the best measure of variability. Using Excel calculations we find:
Q1 = $40,350
Q3 = $55,400
IQR = $15,050

13. Since this data set is fairly symmetric with no obvious outliers, the standard deviation is the preferred measure of variability.
s = √[(Σx² − (Σx)²/n) / (n − 1)]
  = √[(18381 − 653²/25) / 24]
  = √[(18381 − 17056.36) / 24]
  = √(1324.64 / 24)
  = √55.19333
  = 7.429
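The shortcut calculation above can be checked with a few lines of Python. This is a sketch using only the summary values from the solution (Σx = 653, Σx² = 18381, n = 25):

```python
import math

# Shortcut formula for the sample standard deviation:
# s = sqrt( (sum_x2 - (sum_x)^2 / n) / (n - 1) )
sum_x, sum_x2, n = 653, 18381, 25
variance = (sum_x2 - sum_x ** 2 / n) / (n - 1)
s = math.sqrt(variance)
print(round(s, 3))  # 7.429
```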
14. Because the distribution is reasonably symmetric and bell-shaped, the Empirical Rule can be applied. You must create a histogram to check this. Shown below is one possible histogram for the data set.
[Histogram: "Daily Customer Counts, Downtown Automotive"; x-axis: Number of Customers; y-axis: Number of Days]
15. The mean number of daily customers at Downtown Automotive is 26.12 (calculated for Develop Your Skills 3.2, Exercise 8). The standard deviation is 7.43 (the answer to Exercise 13 above). The Empirical Rule says that about 95% of the data points will lie within 2 standard deviations of the mean.
x̄ + 2s = 26.12 + 2(7.43) = 40.98
x̄ − 2s = 26.12 − 2(7.43) = 11.26
If this sample is representative of the population, then 95% of the daily customer counts will be between 11.26 and 40.98. Since the data set is (more or less) symmetric, this means about 2½% of the data will lie below 11.26, and about 2½% will lie above 40.98. About 97.5% of the time, the maximum number of customers Doug would need to plan for is 41.

Develop Your Skills 3.4

16. The scatter diagram showed some (not much) evidence of a positive relationship between household income and monthly spending on restaurant meals. Since the relationship appears to be linear, the Pearson r is the appropriate measure of association. Excel calculates it as 0.42. This is positive and less than 0.5, as we would expect, since the relationship is not very strong.
17. The only choice is b (-0.88). Choices a and c are incorrect, because they are positive and the relationship is clearly negative. Choice d is not correct, because the negative relationship is obviously fairly strong.

18. The Spearman rank correlation coefficient must be used here, since the data are ranked. The Spearman r (calculated with Excel) is 0.61. This indicates a positive relationship between the recruiter's ranking and the supervisor's ranking, but the relationship is not particularly strong.

19. These are quantitative data, and the graph created for Develop Your Skills 2.5, Exercise 24 shows a linear relationship. The Pearson r is the correct measure of association. Excel calculates it at -0.67 (note that you must check for linearity of the relationship before you calculate the Pearson r). There is a negative relationship between the two variables. The greater the number of hours of paid employment during the semester, the lower the semester average mark.

20. Exhibit 3.44b is the graph that corresponds to the negative correlation coefficient of -0.90. This is obvious, since it is the only graph of the three showing a negative relationship. Exhibits 3.44a and c share the same correlation coefficient of 0.73. This is interesting because the correlation probably "looks" stronger in Exhibit 3.44a. However, notice that these two graphs depict exactly the same data, but with the x- and y-axes reversed. Realize that you cannot reliably "eyeball" the strength of a relationship. The correlation coefficient allows us to make much more precise comparisons.

Chapter Review Exercises

1. The mean mark is quite a bit higher than the median mark. This suggests that the distribution of marks is skewed to the right. It is likely that there are a few unusually high marks in the distribution.
2.
The mean weekly sales for both businesses are similar, although the mean sales at the haircutting salon are a bit lower than at the day spa. However, the mean sales at the haircutting salon are much less variable than at the day spa. This would result in a greater number of weeks with higher sales for the haircutting salon, and as a result, it would be a better purchase (all other things being equal).
3.
The mean age is 26.05, and the median age is 20.5. This is as expected. Because the distribution of ages is skewed to the right, the mean is greater than the median. There are several modes in the data set: 8, 9, 12, 20. Clearly, the three lower modes are not good indications of central tendency in this data set. The standard deviation is 17.1. Calculation of the interquartile range (manual method) is as follows: The location of Q1 is 5.25, and its value is 12. The location of Q3 is 15.75, and its value is 38, so the IQR is 26.
4.
Because the Pearson r is higher for Don's data set, the correlation between test marks and calories consumed (for Don) will be higher than the correlation between test
marks and hours spent studying (for Jane). However, there is no obvious reason why eating more calories would result in higher test marks. There is a logical connection between hours spent studying and test marks, so this cause and effect relationship is stronger.

5.
First, remember that with sample data, we cannot absolutely prove anything. As well, although the correlation coefficient is low, this does not mean that there is no relationship between incomes and purchases. As discussed in the answer to Develop Your Skills Exercise 23 in Chapter 2, the lack of relationship between an individual purchase and annual income does not preclude the existence of a relationship between annual purchases and annual income.
6.
Both histograms showed some right-skewness, particularly the purchases by females. However, the mean and median purchases for both groups are similar. The mean purchase by females is $30.86, and the median purchase is $29.50. The mean purchase for males is $28.90, and the median purchase is $28.38. Because the means and medians are so close, we will use the mean as the measure of central tendency. On average, the purchases of males are slightly higher than the purchases of females. Because we used the mean for the measure of central tendency, we will use the standard deviation as the measure of variability. The standard deviation for purchases by females is $10.97, while the standard deviation for purchases by males is $5.39. As we saw in the histograms we created in Chapter 2 (Chapter Review Exercise 6), there is less variability in purchases by males than purchases by females.
7.
Because a histogram of the data is symmetric and bell-shaped, we can apply the Empirical Rule. If the sample is representative of the population, then we can expect that about 68% of the data lie within one standard deviation of the mean, that is, between 170 cm and 184.4 cm, with 32% divided between the two tails of the distribution. This means that about 16% of young men aged 18-24 would be shorter than 170 cm. Almost all of the heights would be within three standard deviations of the mean, that is, between 155.6 cm and 198.8 cm. Therefore, there would not be many young men aged 18-24 who were taller than 199 cm.
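As a quick check, the interval endpoints above can be recomputed in Python. The mean (177.2 cm) and standard deviation (7.2 cm) are inferred from the interval endpoints quoted in the solution; this sketch is not part of the textbook solution itself:

```python
# Empirical Rule intervals for the heights of young men aged 18-24.
# Mean and standard deviation inferred from the quoted endpoints.
mean, s = 177.2, 7.2

one_sd = (mean - s, mean + s)            # about 68% of heights
three_sd = (mean - 3 * s, mean + 3 * s)  # almost all heights

print(one_sd)    # roughly (170.0, 184.4)
print(three_sd)  # roughly (155.6, 198.8)
```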
8.
This is a small data set, and so it is not possible to create a histogram to assess the shape of each location's sales distribution. However, if you order the two data sets, in both cases, there are more observations on the high end of the range than elsewhere, suggesting some skewness. Therefore the interquartile range is probably the best measure of variability. If you do the calculations by hand, the results are as follows. Both data sets have 7 data points.
Q1 location is the 0.25(n+1) = 0.25(8) = 2nd place
Q3 location is the 0.75(n+1) = 0.75(8) = 6th place
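The position rule used here can be written as a small helper function. This is a sketch of the by-hand rule used in these solutions (the function name is ours, not the textbook's):

```python
def quartile_positions(n):
    """Positions of Q1 and Q3 in the ordered data, using the
    0.25(n+1) and 0.75(n+1) rule applied in these solutions."""
    return 0.25 * (n + 1), 0.75 * (n + 1)

# With 7 data points, Q1 is the 2nd ordered value and Q3 the 6th.
q1_pos, q3_pos = quartile_positions(7)
print(q1_pos, q3_pos)  # 2.0 6.0
```

With n = 20, the same rule gives positions 5.25 and 15.75, matching Chapter Review Exercise 3 above.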
By hand:
        Red Deer   Vernon
Q1      109.55     112.30
Q3      122.48     122.01
IQR      12.93       9.71

The Red Deer location's sales are more variable than the Vernon location's sales over the period. If you do the calculations with Excel, the numerical results are different but the conclusion is the same.

With Excel:
        Red Deer   Vernon
Q1      112.23     114.795
Q3      122.42     121.675
IQR      10.19       6.88

9. The mean price of the inkjet printer cartridges is $26.93. The median price is $25.95.
10. The mean weight of the honey in the jars is 497.3 grams. While this is below 500 grams, it is not much below. Without knowing more about the variability of the weights of honey in the jars, we cannot make a conclusion about whether the jars are being consistently underfilled. When you master the techniques of Chapter 7, you will be able to decide.
11. This data set is reasonably symmetric and bell-shaped. The mean weekly sales for the sample of stores trying out the new marketing approach are $5101.07, with a standard deviation of $325.60. Applying the Empirical Rule, almost all of the sales would be between $4124.28 and $6077.85.
[Histogram: "Weekly Sales for Stores with New Marketing Approach"; x-axis: Weekly Sales; y-axis: Number of Stores]
12. First we must assess the shape of the distribution. The histogram below shows a reasonably symmetric and bell-shaped data set.
[Histogram: "Annual Days Off (Other Than Vacation) for a Random Sample of Employees"; x-axis: Days Off; y-axis: Number of Employees]
The mean days off is 6.04, with a standard deviation of 1.29. The sample mean is below the average days off in the past, so there is a reason to hope that there has been an improvement. However, we need to do a formal hypothesis test (covered in Chapter 7) to determine whether there is sufficient evidence to conclude that average days off among all employees has actually decreased. As well, even if we conclude that there has been a decrease in average days off, we cannot necessarily conclude that the wellness program is the cause. Other factors may account for the difference, such as a change in the workforce.

13. Since there are many tied values, it is a bit of a challenge to do the ranking process. The results are as follows.

Ratings by Customers   Rank   Ratings by Bosses   Rank
2                      4      3                   7.5
2                      4      4                   9.5
3                      8      2                   4.5
4                      10     1                   1.5
3                      8      2                   4.5
3                      8      2                   4.5
2                      4      4                   9.5
2                      4      1                   1.5
1                      1      3                   7.5
2                      4      2                   4.5

The Spearman rank correlation coefficient is -0.574. This indicates a negative correlation between the ratings by customers and the ratings by bosses; that is, the ratings by customers tend to be higher when the ratings by bosses are lower. However, the correlation coefficient indicates that this relationship is weak.

14. Since the data sets are fairly symmetric, the mean and the standard deviation are the appropriate measures of central tendency and variability. The results are shown below.

                     Location 1   Location 2
Mean                 108.4        124.4
Standard Deviation   17.3         29.6

Mean daily pedestrian traffic is higher at Location 2, at 124.4, compared with 108.4 at Location 1. The daily pedestrian traffic at Location 2 (standard deviation of 29.6) is also more variable than at Location 1 (standard deviation of 17.3).
15. Because the data are quantitative and appear to be linearly related, the Pearson r is the appropriate measure of association. The Pearson r is 0.958, indicating a high correlation between the mark in Business Math and the mark in Statistics.

16. Because the data are quantitative and appear to be linearly related, the Pearson r is the appropriate measure of association. The Pearson r is 0.941, indicating a high correlation between annual advertising expenditure and annual sales.

17. The customer incomes are skewed to the right, with a few incomes much higher than the rest in the data set. Therefore, the median and the interquartile range are the appropriate measures. The median income of the drugstore customers is $44,925. The interquartile range is $15,050 (Excel) or $15,762.50 (by hand).
18. Before we can decide on appropriate numerical measures to compare the data sets, we must examine the shapes of the distributions, by creating histograms, as shown below. Note that these histograms are not appropriate for comparison of the distributions—they are just for deciding on the appropriate measures.
[Histogram: "Kate's Clients' RRSP Holdings"; x-axis: RRSP Holding; y-axis: Number of Clients]

[Histogram: "Wally's Clients' RRSP Holdings"; x-axis: RRSP Holding; y-axis: Number of Clients]
Since both data sets are reasonably symmetric, we can use the mean and the standard deviation to compare them. The results are shown in the table below.

                     Kate's Clients'   Wally's Clients'
                     RRSP Holdings     RRSP Holdings
Mean                 $111,021.86       $101,092.89
Standard Deviation   $32,050.79        $40,192.47

The mean holdings of Kate's clients' RRSPs are higher, at $111,021.86, than the mean holdings of Wally's clients' RRSPs, at $101,092.89. The variability of the RRSP holdings of Kate's clients is less than for Wally's clients (standard deviation of $32,050.79, compared with $40,192.47).

19. First, we must examine the shape of the distribution. A histogram shows a reasonably symmetric data set (see below).
[Histogram: "Contents of a Sample of Soup Cans"; x-axis: Contents in Millilitres; y-axis: Number of Cans]
The mean measurement is 540.4 mL, with a standard deviation of 5.17 mL. The maximum measurement in the sample is 551 mL, and this does not give any cause for concern that the cans contain more than 556 mL. As well, if we apply the Empirical Rule, we note that almost all of the measurements would be between 524.9 mL and 555.9 mL, which again does not give any cause for concern that the cans contain more than 556 mL. A measurement of 530 mL is about two standard deviations below the mean. The Empirical Rule says that about 95% of the data will lie within two standard deviations of the mean, with the remaining 5% split between the two tails of the distribution. If this can be applied to the population data, then about 2½ % of the cans would contain less than 530 mL.
20. Once again, we must check the distribution of the data set to see if the Empirical Rule applies.
[Histogram: "Contents of a Sample of Soup Cans"; x-axis: Contents in Millilitres; y-axis: Number of Cans]
Since the distribution is approximately bell-shaped and symmetric, we can apply the Empirical Rule. The mean measurement is 543.63 mL, with a standard deviation of 6.44 mL. There is one can of soup in the sample that contains more than 556 mL. Applying the Empirical Rule, we note that 95% of the soup cans would contain between 530.7 mL and 556.5 mL. This leaves about 2½% of the soup cans with more than 556.5 mL, and about 2½% with less than 530.7 mL.

21. Once again, the answers will depend on the most up-to-date data available when you are answering this question. (As a guide, the retail sector data that match the data in the text for the manufacturing sector are discussed. Because of revisions, the more recent data sets may not exactly match these data. For example, the retail series was significantly changed between the time of the original download in May of 2009, and a subsequent download in November 2009. It is a challenge to come up with a class width for comparison for the two sectors, since quarterly operating profits are much smaller for the retail sector than for the manufacturing sector. The compromise choice of $1 billion is really too narrow for the manufacturing data, and not wide enough for the retail sector data. However, these histograms give a starting point for the analysis.)
[Histogram: "Quarterly Operating Profits, Canadian Retail Sector, I 1988 to III 2008"; x-axis: Millions of Dollars; y-axis: Number of Quarters]

[Histogram: "Quarterly Operating Profits, Canadian Manufacturing, I 1988 to III 2008"; x-axis: Millions of Dollars; y-axis: Number of Quarters]
The histograms show that quarterly operating profits are smaller for the Canadian retail sector than for the manufacturing sector. For over half the period, quarterly operating profits for the retail sector were < $2 billion. In over 60% of the quarters in the period under study, the quarterly operating profits of the Canadian manufacturing sector were $8 billion or more. The distributions of quarterly operating profits also differ. The distribution for the retail sector profits is skewed to the right, with profits above $3 billion in a few quarters. The distribution for the manufacturing sector profits is skewed to the left, with a few quarters where operating profits were unusually low (below $5 billion).
Median quarterly operating profits for the manufacturing sector were $8.909 billion, much greater than the median quarterly operating profits for the retail sector, at $1.827 billion. Quarterly operating profits for the manufacturing sector were much more variable over the period, with an interquartile range of $4.648 billion, compared with only $1.1 billion for the retail sector.
There appears to be some slight positive correlation between the quarterly operating profits of the two sectors, but it is not strong. The scatter diagram below illustrates.

[Scatter diagram: "Quarterly Operating Profits for Two Canadian Sectors, I 1988 to III 2008"; x-axis: Quarterly Operating Profits of the Retail Sector ($ Millions); y-axis: Quarterly Operating Profits of the Manufacturing Sector ($ Millions)]
The Pearson r is 0.42, confirming the impression from the scatter diagram of a weak positive relationship. When quarterly operating profits of the retail sector are higher, the quarterly operating profits of the manufacturing sector tend to be higher, but the correlation is weak. (Your comparison, with more up-to-date data, should contain all of the elements shown in this answer.)
Instructor’s Solutions Manual - Chapter 4
Chapter 4 Solutions

Develop Your Skills 4.1

1a. Sample space: 246 employees commute more than 40 km by car; 350 − 246 = 104 commute 40 km or less by car.
P(randomly selected employee commutes > 40 km by car) = 246/350 = 0.7029

1b. Sample space: 150 employees arrange rides with others; 350 − 150 = 200 ride alone.
P(randomly selected employee arranges rides with others) = 150/350 = 0.4286

Note that the sample space need not be more complicated than necessary. A full description of the sample space, for both worker characteristics (commuting distance and arranging rides), could look as follows.

Commuting Characteristics of Car Part Manufacturing Plant
                  Arrange Rides With Others   Ride Alone        Totals
Commute > 40 km   246 − 135 = 111             135               246
Commute ≤ 40 km   104 − 65 = 39               200 − 135 = 65    350 − 246 = 104
Totals            150                         350 − 150 = 200   350

2.
Sample space: We are interested only in managers, so the sample space is as follows:
Only high-school education: 0
Grad degree or post-grad studies: 10
Up to 4 years of post-secondary education: 37 − 10 − 0 = 27
P(randomly selected manager has up to 4 years of post-secondary education) = 27/37 = 0.7297
3.
Sample space:
Professional employees: 372
Managers: 37 (Note that "managerial" and "professional" are separate job classifications, so managers are not included in the count of professional workers.)
Clerical employees: 520 − 372 − 37 = 111
P(randomly selected employee is professional) = 372/520 = 0.7154
P(randomly selected employee is clerical) = 111/520 = 0.2135
4.
Sample space:
Shoppers doing a quick trip: 62
Shoppers doing a major stock-up: 13
Shoppers doing a fill-in shop: 100 − 62 − 13 = 25
P(randomly selected customer is doing a fill-in shop) = 25/100 = 0.25
5.
Sample space:
Loved previous math courses: 56
Worked very hard in previous math courses but did not enjoy them, or thought previous math courses were far too difficult, or equated previous math courses with sticking needles in the eyes: 225 − 56 = 169
P(randomly selected student did not love his/her previous math courses) = 169/225 = 0.7511
Develop Your Skills 4.2

6. These probabilities are given:
P(pays with credit card) = P(CC) = 0.80
P(buys something other than gas) = P(OTG) = 0.25
P(pays with credit card and buys something other than gas) = P(CC & OTG) = 0.20
We want to know P(pays with credit card GIVEN buys something other than gas):
P(CC | OTG) = P(CC & OTG)/P(OTG) = 0.20/0.25 = 0.80

We also want to know P(buys something other than gas GIVEN pays with a credit card):
P(OTG | CC) = P(CC & OTG)/P(CC) = 0.20/0.80 = 0.25

To check for independence, compare P(CC | OTG) with P(CC). The two probabilities are equal. So, paying with a credit card and buying something other than gas are independent (not related).
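The same independence check can be sketched in a few lines of Python, using the probabilities given in this exercise:

```python
# Independence check: A and B are independent exactly when
# P(A | B) = P(A). Values from Exercise 6.
p_cc = 0.80          # P(pays with credit card)
p_otg = 0.25         # P(buys something other than gas)
p_cc_and_otg = 0.20  # P(both)

p_cc_given_otg = p_cc_and_otg / p_otg  # 0.80
p_otg_given_cc = p_cc_and_otg / p_cc   # 0.25

independent = abs(p_cc_given_otg - p_cc) < 1e-9
print(p_cc_given_otg, p_otg_given_cc, independent)
```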
7.
P(honey-nut flavour GIVEN family size box)
= P(honey-nut & family size)/P(family size) = 180/315 = 0.5714
To check for independence, compare this with
P(honey-nut flavour) = (315 + 180)/900 = 495/900 = 0.55
Since the two probabilities are NOT equal, the events are NOT independent; that is, the size of the box and the flavour are related, in terms of sales.
8.
P(female given employed in agriculture industry) = 96.5/(96.5 + 230.5) = 96.5/327 = 0.2951

P(employed in public administration, given male) = 454/9,021.3 = 0.0503

P(employed in public administration) = (454 + 471.7)/(9,021.3 + 8,104.5) = 925.7/17,125.8 = 0.0541

Since P(employed in public administration, given male) ≠ P(employed in public administration), we can say that sex of the employee and industry of employment were not independent in Canada in 2008. However, the difference in probabilities is not that great (both are around 5%). We can also see that
P(employed in manufacturing) = (1,410.4 + 559.9)/(9,021.3 + 8,104.5) = 1,970.3/17,125.8 = 0.1150
which is not equal to
P(employed in manufacturing given male) = 1,410.4/9,021.3 = 0.1563
This is more convincing evidence that sex of the employee was not independent of the industry of employment in Canada in 2008.
9.
It is easiest to proceed if we first compute row and column totals for the table.
Accounts Receivable for a Roofing Company
Age                < $5,000   $5,000 - <$10,000   $10,000 or more   Total
< 30 days          12         15                  10                37
30 - <60 days      7          11                  2                 20
60 days and over   3          4                   1                 8
Total              22         30                  13                65

For the accounts receivable at this roofing company:
P(< 30 days) = 37/65 = 0.5692
P(< 30 days | < $5,000) = 12/22 = 0.5455
Since the two probabilities are not equal, account age and amount are not independent.

10. First calculate the row and column totals to make the probability calculations easier.

Customer Survey for a Dry Cleaning Company
Service Rating   Will Use Services Again   Will Not Use Services Again   Total
Poor             174                       986                           1,160
Fair             232                       928                           1,160
Good             2,436                     174                           2,610
Excellent        754                       116                           870
Total            3,596                     2,204                         5,800

P(customer will use dry cleaning company's services again) = 3,596/5,800 = 0.62
P(customer will not use the dry cleaning company's services again, given a rating of "good" or "excellent") = (174 + 116)/(2,610 + 870) = 290/3,480 = 0.0833
This is interesting, because about 8% of customers are not planning on using the dry cleaning company's services again, even though they rate the service as good or excellent. Clearly something other than the service is keeping these customers away.
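The table arithmetic in Exercise 10 can be sketched in Python; the counts below are taken from the dry cleaning survey table (the dictionary layout and key names are ours):

```python
# Customer counts by service rating (rows) and intention (columns),
# from Exercise 10.
table = {
    "Poor":      {"again": 174,  "not again": 986},
    "Fair":      {"again": 232,  "not again": 928},
    "Good":      {"again": 2436, "not again": 174},
    "Excellent": {"again": 754,  "not again": 116},
}

total = sum(sum(row.values()) for row in table.values())       # 5,800
p_again = sum(row["again"] for row in table.values()) / total  # 0.62

good_or_exc = (table["Good"], table["Excellent"])
p_not_again_given_positive = (
    sum(row["not again"] for row in good_or_exc)
    / sum(sum(row.values()) for row in good_or_exc)
)  # 290/3,480

print(round(p_again, 2), round(p_not_again_given_positive, 4))  # 0.62 0.0833
```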
To test for independence, we could compare P(customer will not use the dry cleaning company's services again) with the conditional probability we just calculated above.
P(customer will not use the dry cleaning company's services again)
= 1 − P(customer will use dry cleaning company's services again)
= 1 − 0.62 = 0.38 ≠ 0.0833
Since these probabilities are not equal, the events are not independent. There is a relationship between the rating of the service and the tendency to use it again. Overall, 38% of customers don't plan to use the service again. However, only about 8% of those who rated the service positively don't plan to use it again.

Develop Your Skills 4.3

11. We are told that the two employees live in different parts of the city, and so presumably could not be held up by the same traffic problems. Assume that each employee's lateness is independent of the other's lateness. Then
P(both are late) = P(Jane is late and Oscar is late) = 0.02 • 0.04 = 0.0008
The probability is low that both Jane and Oscar will be late for work.
12. We are told the friends are very different, and will assume that any one of them getting a job in the financial services industry is independent of the others getting such a job. Label the friends "1", "2" and "3".

a. P(all three of them succeed)
= P(1 does and 2 does and 3 does)
= 0.4 • 0.5 • 0.35 = 0.07

b. P(none of them succeeds)
= P(1 does not and 2 does not and 3 does not)
= (1 − 0.4) • (1 − 0.5) • (1 − 0.35)
= 0.6 • 0.5 • 0.65 = 0.195

c. P(at least one of them succeeds)
= 1 − P(none of them succeeds)
= 1 − 0.195 = 0.805
Using the complement rule here is a life-saver!
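The complement-rule calculation in part c can be sketched as:

```python
# P(at least one succeeds) = 1 - P(none succeeds), for independent
# events with the success probabilities from Exercise 12.
p = [0.4, 0.5, 0.35]

p_none = 1.0
for prob in p:
    p_none *= (1 - prob)  # all three fail

p_at_least_one = 1 - p_none
print(round(p_none, 3), round(p_at_least_one, 3))  # 0.195 0.805
```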
13. P(game aimed at 15-25 year olds succeeding) = 0.34 P(accounting program for small business succeeding) = 0.12 P(payroll system for government organizations succeeding) = 0.10 We are told to assume that the events are independent. P(all three succeed) = 0.34 • 0.12 • 0.10 = 0.00408 For calculation of at least two out of three succeeding, we need to think about what this means, in terms of the sample space. At least two out of three succeeding means exactly two out of the three succeeding, or all three succeeding. A tree diagram might be helpful to picture this. We need to calculate and add the probabilities for the cases shown in bolded letters on the right-hand side of the tree diagram.
[Tree diagram: branch 1, game (S 0.34 / F 0.66); branch 2, accounting program (S 0.12 / F 0.88); branch 3, payroll system (S 0.10 / F 0.90). The eight outcomes are SSS, SSF, SFS, SFF, FSS, FSF, FFS, FFF; the outcomes with at least two successes are SSS, SSF, SFS and FSS.]
Once we have the cases identified, it is just a matter of arithmetic. P(at least two out of three succeed) =(0.34 • 0.12 • 0.10)+(0.34 • 0.12 • 0.90)+(0.34 • 0.88 • 0.10)+(0.66 • 0.12 • 0.10) =0.00408+0.03672+0.02992+0.00792=0.07864
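The tree-diagram bookkeeping can also be done by enumerating all eight outcomes in Python. This is a sketch using the success probabilities from this exercise:

```python
from itertools import product

# Success probabilities for the game, accounting program and payroll
# system (independent events, Exercise 13).
p = [0.34, 0.12, 0.10]

p_at_least_two = 0.0
for outcome in product([True, False], repeat=3):  # all 8 outcomes
    prob = 1.0
    for success, p_s in zip(outcome, p):
        prob *= p_s if success else 1 - p_s
    if sum(outcome) >= 2:  # at least two of the three succeed
        p_at_least_two += prob

print(round(p_at_least_two, 5))  # 0.07864
```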
14. A fully-labelled tree diagram for the GeorgeConn customer data is shown below.
First way, branching on R/U first:
P(R) = 4/10; P(S | R) = 3/4, so P(R and S) = 0.3; P(N | R) = 1/4, so P(R and N) = 0.1
P(U) = 6/10; P(S | U) = 4/6, so P(U and S) = 0.4; P(N | U) = 2/6, so P(U and N) = 0.2

Another way to set up the tree diagram, branching on S/N first:
P(S) = 7/10; P(R | S) = 3/7, so P(S and R) = 0.3; P(U | S) = 4/7, so P(S and U) = 0.4
P(N) = 3/10; P(R | N) = 1/3, so P(N and R) = 0.1; P(U | N) = 2/3, so P(N and U) = 0.2
15. P(hourly worker or only high school education)
= P(hourly worker) + P(only high school education) − P(hourly worker and only high school education)
= (790 + 265 + 2)/1345 + (790 + 7 + 1 + 0)/1345 − 790/1345
= (790 + 265 + 2 + 7 + 1)/1345
= 1065/1345 = 0.7918

Chapter Review Exercises

1. P(account paid early) = 119/750 = 0.1587
P(account paid on time) = 320/750 = 0.4267
P(account paid late) = 200/750 = 0.2667
P(account uncollectible) = 111/750 = 0.1480
2.
The probability calculations may seem easier if you organize the information into a table, as follows.

         Business Diploma   No Business Diploma   Total
Men      30                 30                    60
Women    25                 15                    40
Total    55                 45                    100

a. P(employee is a man) = 60/100 = 0.60
b. P(employee is a man with a Business diploma) = 30/100 = 0.30
c. P(employee is a woman) = 40/100 = 0.40
d. P(employee is a woman with a Business diploma) = 25/100 = 0.25
e. P(employee has a Business diploma) = 55/100 = 0.55
f. P(employee is a man without a Business diploma) = 30/100 = 0.30
g. P(employee is without a Business diploma) = 45/100 = 0.45
h. P(employee is a woman without a Business diploma) = 15/100 = 0.15
i. P(employee is a woman or has a Business diploma) = 40/100 + 55/100 − 25/100 = 0.70
j. P(employee is a man or has a Business diploma) = 60/100 + 55/100 − 30/100 = 0.85
k. P(employee is a woman or is without a Business diploma) = 40/100 + 45/100 − 15/100 = 0.70
l. P(employee is a man or is without a Business diploma) = 60/100 + 45/100 − 30/100 = 0.75
3.
P(employee has a Business diploma given she is a woman) = 25/40 = 0.625
To test whether gender and possession of a Business diploma are related for GeorgeConn employees, we can compare the probability above to
P(employee has a Business diploma) = 55/100 = 0.55
Since P(employee has a Business diploma given she is a woman) ≠ P(employee has a Business diploma), gender and possession of a Business diploma are related for GeorgeConn employees. Female employees are more likely to have a Business diploma.

4.
P(caller directly connected) = 0.80
P(caller forced to wait) = 0.20
P(caller connected directly for three different calls)
= P(caller connected on first day AND connected on second day AND connected on third day)
= 0.8 • 0.8 • 0.8 = 0.512
P(caller forced to wait for three different calls)
= P(caller forced to wait on first day AND forced to wait on second day AND forced to wait on third day)
= 0.2 • 0.2 • 0.2 = 0.008
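A one-line check of the multiplication rule for independent events, using the values from this exercise:

```python
# Same outcome on three independent calls: multiply the single-call
# probability three times (values from Exercise 4).
p_connect = 0.80

p_three_connects = p_connect ** 3     # 0.8 * 0.8 * 0.8
p_three_waits = (1 - p_connect) ** 3  # 0.2 * 0.2 * 0.2

print(round(p_three_connects, 3), round(p_three_waits, 3))  # 0.512 0.008
```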
5.
Start by totalling the rows and columns of the table. This will speed up the probability calculations.
a. P(primary skill is bookkeeping) = 30/100 = 0.30
b. P(employee has less than one year of experience) = 50/100 = 0.50
c. P(primary skill is reception) = 25/100 = 0.25
d. P(employee has one to two years of experience) = 23/100 = 0.23
e. P(primary skill is document management) = 45/100 = 0.45
f. P(employee has more than two years of experience) = 27/100 = 0.27
6.
Begin by totalling rows and columns.

Survey of Restaurant Customers
Opinion About Food   Satisfied with Service   Not Satisfied with Service   Totals
Excellent            0.36                     0.06                         0.42
Good                 0.18                     0.07                         0.25
Fair                 0.10                     0.08                         0.18
Poor                 0.05                     0.10                         0.15
Totals               0.69                     0.31                         1.00

a. P(customer is satisfied with service and rates the food as poor) = 0.05
b. P(customer is not satisfied with service) = 0.31
c. P(not satisfied with service given food rated as poor) = 0.10/0.15 = 0.6667
d. To test for independence, we could compare the probability in part c with P(not satisfied with service) = 0.31. The two probabilities are not equal, so the service rating and the food rating are related (that is, NOT independent). People who rate the food as poor are more likely to be dissatisfied with the service.
7.
P(salesperson will exceed targets two years in a row)
= P(salesperson exceeds target this year and exceeds target next year)
= P(exceeds target this year) • P(exceeds target next year | exceeds target this year)
= 0.78 • 0.15 = 0.117
8.
Begin by summing rows and columns of the table.

Follow-up Survey of Customers Who Bought Netbook Computers
                        Satisfied   Not Satisfied   Totals
1 GB of RAM or Less     0.30        0.05            0.35
More than 1 GB of RAM   0.50        0.15            0.65
Totals                  0.80        0.20            1.00

a. P(satisfied with his/her purchase) = 0.30 + 0.50 = 0.80
b. P(satisfied with his/her purchase | more than 1 GB of RAM) = 0.50/(0.50 + 0.15) = 0.7692
c. The amount of RAM affects whether or not the purchaser was satisfied with his/her purchase. P(satisfied | more than 1 GB of RAM) = 0.7692 ≠ P(satisfied) = 0.80. Customers who bought netbooks with more than 1 GB of RAM were less likely to be satisfied with their purchases. This may seem odd, because generally more RAM means a better computer. However, it is possible that customers buying machines with more RAM had higher performance expectations that could not be met with slower processors.
9.
A tree diagram is helpful. The key probabilities are:
P(pass on first attempt) = 0.75
P(fail on first attempt) = 0.25
P(pass on second attempt | failed first) = 0.90, so P(fail then pass) = 0.25 • 0.90 = 0.225
P(fail on second attempt | failed first) = 0.10, so P(fail then fail) = 0.25 • 0.10 = 0.025
P(pass with no more than two attempts)
= P(pass the first time) + P(fail the first time and pass the second time)
= 0.75 + 0.225 = 0.975

10. Again, begin by summing rows and columns in the table.

Customers of an Insurance Company
          Single   Married   Divorced   Totals
Male      25       125       30         180
Female    50       50        20         120
Totals    75       175       50         300

a.
P(female or married) = P(female) + P(married) – P( female and married) = (50+50+20)/300 + (125 + 50)/300 – 50/300 = (50 + 50 + 20 + 125 + 50 - 50)/300 = 245/300 = 0.8167
b.
P(married | male) = P(married and male)/P(male) = (125/300) / ((25 + 125 + 30)/300) = 125/180 = 0.6944
c.
P(married) = (125 + 50)/300 = 0.5833 ≠ P(married | male) = 0.6944. Since the probabilities are not equal, gender and marital status are not independent.
11. R: market rises; RC: market does not rise; P: newsletter predicts rise; PC: newsletter predicts market will not rise. Summarizing the tree diagram:

First stage: P(R) = 0.60, P(RC) = 0.40
Second stage: P(P | R) = 0.70, P(PC | R) = 0.30; P(P | RC) = 0.30, P(PC | RC) = 0.70

Joint probabilities:
P(R and P) = 0.60 • 0.70 = 0.42
P(R and PC) = 0.60 • 0.30 = 0.18
P(RC and P) = 0.40 • 0.30 = 0.12
P(RC and PC) = 0.40 • 0.70 = 0.28
P(correct prediction)
= P(market rises and newsletter predicts rise) + P(market does not rise and newsletter predicts market will not rise)
= 0.42 + 0.28 (from the joint probabilities above)
= 0.70

12. P(all of the students selected by the company are female) = 6/10 • 5/9 • 4/8 = 120/720 = 0.1667
P(all of the students selected by the company are male) = 4/10 • 3/9 • 2/8 = 24/720 = 0.0333
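The Exercise 12 products of conditional probabilities can be verified with a short script. This Python sketch is an illustrative aside (the text itself works by hand or in Excel); it multiplies the without-replacement probabilities exactly using fractions:

```python
from fractions import Fraction

# Illustrative aside: P(all draws come from one group) when drawing
# without replacement, as in Exercise 12 (6 females, 4 males, 3 drawn).
def all_from_group(group, total, draws):
    p = Fraction(1)
    for i in range(draws):
        # Each draw removes one person from the group and from the total.
        p *= Fraction(group - i, total - i)
    return p

p_all_female = all_from_group(6, 10, 3)  # 6/10 * 5/9 * 4/8
p_all_male = all_from_group(4, 10, 3)    # 4/10 * 3/9 * 2/8
print(float(p_all_female), float(p_all_male))
```

The exact fractions reduce to 120/720 = 1/6 and 24/720 = 1/30, matching 0.1667 and 0.0333.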
13. If we can identify one situation where gender and the tendency to use the health facilities are related, we can conclude that the two characteristics are related. Compare P(used the facilities) with P(used the facilities | male).
P(used the facilities) = 210/350 = 0.6
P(used the facilities | male) = 65/170 = 0.3824
Since these two probabilities are not equal, gender and tendency to use the health and fitness facilities are not independent (that is, they are related).

14.
Summarizing the tree diagram (U: used the facilities, D: did not; M: male, F: female):

First stage: P(U) = 210/350, P(D) = 140/350
Second stage: P(M | U) = 65/210, P(F | U) = 145/210; P(M | D) = 105/140, P(F | D) = 35/140

Joint probabilities:
P(U and M) = 65/350 = 0.1857
P(U and F) = 145/350 = 0.4143
P(D and M) = 105/350 = 0.3
P(D and F) = 35/350 = 0.1

The joint probabilities are the same, as we would expect them to be.

15. One of the ways to test for independence (or lack of it) is as follows.
P(purchased the product) = 228/300 = 0.76
P(purchased the product | saw the TV ad) = 152/(152 + 36) = 152/188 = 0.8085
These two probabilities are not equal, so purchasing behaviour is related to seeing the TV ad. Those who saw the ad were more likely to purchase the product. 16.
A: Brenda moves to Alberta; AC: Brenda does not move to Alberta; B: Brenda is offered the job at Canada’s largest bank; BC: Brenda is not offered the job at Canada’s largest bank. Summarizing the tree diagram:

First stage: P(B) = 0.25, P(BC) = 0.75
Second stage: P(A | B) = 0.7, P(AC | B) = 0.3; P(A | BC) = 0.35, P(AC | BC) = 0.65

Joint probabilities:
P(B and A) = 0.25 • 0.7 = 0.175
P(B and AC) = 0.25 • 0.3 = 0.075
P(BC and A) = 0.75 • 0.35 = 0.2625
P(BC and AC) = 0.75 • 0.65 = 0.4875
From the tree diagram: P(Brenda will be offered the job and not move to Alberta) = 0.075
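The multiplication rule behind the Exercise 16 tree can be checked in a few lines. This Python sketch is an illustrative aside, not part of the printed solution:

```python
# Illustrative aside: the multiplication rule P(B and A) = P(B) * P(A | B)
# applied to Exercise 16's tree probabilities.
p_offered = 0.25          # B: offered the job
p_move_given_offer = 0.7  # A | B: moves to Alberta if offered

# "Offered the job and does not move" uses the complement of A given B.
p_offered_and_stays = p_offered * (1 - p_move_given_offer)
print(round(p_offered_and_stays, 4))  # 0.075
```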
17. I: Canadian adult has taken instruction in canoeing; IC: has not taken instruction; CT: is going on a canoe trip this summer; CTC: is not going on a canoe trip this summer. Summarizing the tree diagram:

First stage: P(I) = 0.03, P(IC) = 0.97
Second stage: P(CT | I) = 0.46, P(CTC | I) = 0.54; P(CT | IC) = 0.20, P(CTC | IC) = 0.80

Joint probabilities:
P(I and CT) = 0.03 • 0.46 = 0.0138
P(I and CTC) = 0.03 • 0.54 = 0.0162
P(IC and CT) = 0.97 • 0.20 = 0.194
P(IC and CTC) = 0.97 • 0.80 = 0.776
18. P(a randomly-selected Canadian adult is going on a canoe trip this summer, and has taken some canoeing instruction) = 0.0138
P(a randomly-selected Canadian adult is going on a canoe trip this summer, and has not taken any canoeing instruction) = 0.194

19. P(a randomly-selected customer from one of these stores uses a cash/debit card or a credit card for payment) = (150 + 180)/500 = 0.66

20. If we can identify one situation where payment method and store location are related, we can say that they are related in general. One approach is to compare P(cheque) with P(cheque | Store A).
P(cheque) = 170/500 = 0.34
P(cheque | Store A) = 30/100 = 0.30
Since these two probabilities are not equal, payment method and store location are not independent.
21. For people who visit the facility:
P(buy a membership) = 0.40
P(buy a membership and sign up for fitness classes) = 0.30
P(fitness classes | bought a membership) = 0.30/0.40 = 0.75

22. Buying a membership and signing up for fitness classes are NOT mutually exclusive. We are told that P(buy a membership and sign up for fitness classes) = 0.30 ≠ 0. A person can do both, so the events are not mutually exclusive. We cannot assess independence without more information. For example, if we knew P(fitness classes), we could compare that with P(fitness classes | bought a membership).

23. If we can identify one situation where gender and type of alcoholic drink are related, we can say that they are related in general. However, there is no case where gender and type of alcoholic drink are related.
P(wine) = (36 + 54)/(42 + 63 + 36 + 54 + 22 + 33) = 90/250 = 0.36
P(wine | female) = 54/(63 + 54 + 33) = 0.36
P(wine | male) = 36/(42 + 36 + 22) = 0.36
So we can see that P(wine) = P(wine | female) = P(wine | male). Similarly, P(beer) = P(beer | female) = P(beer | male). As well, P(other alcoholic drinks) = P(other alcoholic drinks | female) = P(other alcoholic drinks | male). We cannot identify a situation where gender and type of alcoholic drink are not independent, so we conclude that gender and type of alcoholic drink ordered are independent in this sample.

24. 500 circuit boards, of which 30 are defective and 470 are not defective.
P(all three are defective) = 30/500 • 29/499 • 28/498 = 24,360/124,251,000 = 0.000196054
P(defective board on 1st selection) = 30/500 = 0.06
P(all 3 boards defective, assuming independence) = 0.06 • 0.06 • 0.06 = 0.000216
The probabilities agree, to four decimal places.

25. In this case, we are stuck. We have only one probability (25%), but we do not have independent events. Once the first randomly-selected Canadian is asked about RRSP plans, he/she is removed from further consideration.
Depending on whether this person plans to make an RRSP contribution over the next year, this will affect the 25% probability of making a contribution. However, it will not affect it very much,
because there are many millions of Canadians. So, although the events are not really independent, we can still use the probability as if they were. P(all four intend to contribute to their RRSPs over the next year) = 0.25 • 0.25 • 0.25 • 0.25 = 0.0039 26. This looks like a long and complicated question, but it isn't, as long as the information is organized properly. You can use the Sort tool (under the Data tab) in Excel to help organize the data as you require it (the use of the Sort tool was described in Chapter 1). You might also explore the use of the Filter tool (use Excel's Help function if you can't see how it works.) There are 24 employees in total. a.
P(employee has low experience) = 13/24 = 0.5417
P(employee has high experience) = 11/24 = 0.4583
P(employee has a specialty in spreadsheet software) = 11/24 = 0.4583
P(employee has a specialty in presentation software) = 4/24 = 0.1667
b.
There are 11 employees who specialize in spreadsheet software.
P(first employee selected has high experience, given specialization in spreadsheet software) = 4/11 = 0.3636
P(second employee selected has high experience, given that the first employee selected had high experience, both with specialization in spreadsheet software) = 3/10 = 0.30
P(both employees selected have high experience, given specialization in spreadsheet software) = (4/11)(3/10) = 12/110 = 0.1091
P(at least one of the two employees has high experience)
= 1 – P(neither of the employees has high experience)
= 1 – (7/11)(6/10)
= 1 – 42/110
= 68/110
= 0.6182

c.
The joint probability table is shown below (counts are shown; divide by 24 for probabilities).

                           Low   High   Totals
Database Software            0      1       1
Presentation Software        4      0       4
Spreadsheet Software         7      4      11
Word Processing Software     2      6       8
Totals                      13     11      24

P(high experience, given specialty is word processing) = 6/8 = 0.75
P(low experience, given specialty is spreadsheet) = 7/11 = 0.6364

d.
The joint probability table is shown below (counts are shown; divide by 24 for probabilities).

                            F      M    Totals
Database Software            0      1       1
Presentation Software        3      1       4
Spreadsheet Software         5      6      11
Word Processing Software     4      4       8
Totals                      12     12      24

P(word processing specialization, given female) = 4/12 = 0.3333
P(male, given spreadsheet specialization) = 6/11 = 0.5455

e.
Because the tree diagram has three stages, it takes up an entire page (see the next page). Notice that the end-stage probabilities could have been calculated directly from the table of information about the employees. It is useful to explore the structure of the sample space with the tree diagram. M: Male F: Female D: Specializes in Database Software P: Specializes in Presentation Software S: Specializes in Spreadsheet Software W: Specializes in Word Processing Software L: Low Experience H: High Experience
[Tree diagram, three stages: gender first (M: 12/24, F: 12/24), then software specialty, then experience level. Each end-stage probability is a count out of 24 and corresponds to the entries in the tables in parts c and d; for example, the single database specialist is male with high experience, so P(M and D and H) = 1/24.]
f.
This tree diagram is just another way of representing the sample space. The end-stage probabilities match those in part e.
[Tree diagram, three stages in a different order: experience level first (H: 11/24, L: 13/24), then gender, then software specialty. The end-stage probabilities, again counts out of 24, match those in part e.]
Instructor’s Solutions Manual - Chapter 5
Chapter 5 Solutions

Develop Your Skills 5.1

1.
a. Discrete. The number of passengers on a flight from Toronto to Paris is a count.
b. Continuous. The time it takes you to drive to work in the morning could take on any one of an infinite number of possible values, within some range (shortest possible trip to longest possible trip).
c. Discrete. The number of cars that arrive at the local car dealership for an express oil change service on Wednesday is a count.
d. Continuous. The time it takes to cut a customer’s lawn could take on any one of an infinite number of possible values, within some range (smallest, easiest lawn to largest, most difficult lawn).
e. Discrete. The number of soft drinks a student buys during one week is a count.
f. Continuous. The kilometres driven on one tank of gas could take on any one of an infinite number of possible values, within some range (shortest possible distance to longest possible distance).

2.
x      0      1      2      3
P(x)   0.195  0.43   0.305  0.07

P(x=3) = 0.07 from previous calculations
P(x=0) = 0.195 from previous calculations
P(x=1) = P(1 does, 2 does not, 3 does not) + P(1 does not, 2 does, 3 does not) + P(1 does not, 2 does not, 3 does)
= (0.4 • 0.5 • 0.65) + (0.6 • 0.5 • 0.65) + (0.6 • 0.5 • 0.35)
= 0.13 + 0.195 + 0.105 = 0.43
P(x=2) = 1 – P(x = 0 or 1 or 3) = 1 – 0.195 – 0.43 – 0.07 = 0.305
3.
Summarizing the tree diagram: S denotes success and F failure for the three independent projects, in the order game program, accounting program, payroll system, with success probabilities 0.34, 0.12, and 0.10 respectively.

Joint probabilities:
P(SSS) = 0.34 • 0.12 • 0.10 = 0.00408
P(SSF) = 0.34 • 0.12 • 0.90 = 0.03672
P(SFS) = 0.34 • 0.88 • 0.10 = 0.02992
P(SFF) = 0.34 • 0.88 • 0.90 = 0.26928
P(FSS) = 0.66 • 0.12 • 0.10 = 0.00792
P(FSF) = 0.66 • 0.12 • 0.90 = 0.07128
P(FFS) = 0.66 • 0.88 • 0.10 = 0.05808
P(FFF) = 0.66 • 0.88 • 0.90 = 0.52272
P(x=0) = 0.52272
P(x=3) = 0.00408
P(x=1) = 0.26928 + 0.07128 + 0.05808 = 0.39864
P(x=2) = 1 – 0.52272 – 0.00408 – 0.39864 = 0.07456

x      0        1        2        3
P(x)   0.52272  0.39864  0.07456  0.00408

The expected number of successes is μ = 0(0.52272) + 1(0.39864) + 2(0.07456) + 3(0.00408) = 0.56
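The distribution above can be rebuilt by enumerating all eight branches of the tree. This Python sketch is an illustrative aside; the success probabilities 0.34, 0.12, and 0.10 are taken from the solution above:

```python
from itertools import product

# Illustrative aside: distribution of the number of successes for three
# independent projects with different success probabilities (Exercise 3).
probs = [0.34, 0.12, 0.10]

dist = {k: 0.0 for k in range(4)}
for outcome in product([1, 0], repeat=3):
    p = 1.0
    for success, ps in zip(outcome, probs):
        p *= ps if success else 1 - ps
    dist[sum(outcome)] += p  # group branches by their number of successes

expected = sum(k * p for k, p in dist.items())
print({k: round(p, 5) for k, p in dist.items()})
print(round(expected, 2))  # 0.56
```

Note that the expected value also equals the sum of the three success probabilities, 0.34 + 0.12 + 0.10 = 0.56, a property of sums of independent indicator variables.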
4.
D: defective circuit board found OK: circuit board not defective
Summarizing the tree diagram:

First draw: P(D) = 30/5000, P(OK) = 4970/5000
Second draw: P(D | D) = 29/4999, P(OK | D) = 4970/4999; P(D | OK) = 30/4999, P(OK | OK) = 4969/4999

Joint probabilities:
P(D and D) = (30 • 29)/(5000 • 4999)
P(D and OK) = (30 • 4970)/(5000 • 4999)
P(OK and D) = (4970 • 30)/(5000 • 4999)
P(OK and OK) = (4970 • 4969)/(5000 • 4999)

x      0          1          2
P(x)   0.9880348  0.0119304  0.0000348
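The same two-draw, without-replacement distribution can be computed exactly with fractions. A Python sketch, offered as an illustrative aside:

```python
from fractions import Fraction

# Illustrative aside: number of defectives in two draws without
# replacement from 5,000 boards of which 30 are defective (Exercise 4).
N, D = 5000, 30

p2 = Fraction(D, N) * Fraction(D - 1, N - 1)
p1 = (Fraction(D, N) * Fraction(N - D, N - 1)
      + Fraction(N - D, N) * Fraction(D, N - 1))
p0 = Fraction(N - D, N) * Fraction(N - D - 1, N - 1)

print(round(float(p0), 7), round(float(p1), 7), round(float(p2), 7))
```

The exact fractions sum to 1, and rounding reproduces the table values 0.9880348, 0.0119304, and 0.0000348.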
5. Number of customers who order the daily special at a restaurant, out of the next 6 customers:

x      0     1     2     3     4     5     6
P(x)   0.03  0.05  0.28  0.45  0.12  0.04  0.03

P(x=6) = 1 – 0.03 – 0.05 – 0.28 – 0.45 – 0.12 – 0.04 = 0.03

μ = Σx•P(x) = 0(0.03) + 1(0.05) + 2(0.28) + 3(0.45) + 4(0.12) + 5(0.04) + 6(0.03) = 2.82

σ² = Σx²•P(x) – μ²
= (0²(0.03) + 1²(0.05) + 2²(0.28) + 3²(0.45) + 4²(0.12) + 5²(0.04) + 6²(0.03)) – 2.82²
= 9.22 – 7.9524
= 1.2676

σ = √1.2676 = 1.1259
Develop Your Skills 5.2

6. In this case, sampling is without replacement, but we assume the college has thousands of students. The sample size is probably much less than 5% of the population, so the binomial distribution can still be used to approximate the probabilities. According to the newspaper, p = 0.8.
n = 10
P(x ≤ 4, n=10, p=0.8) = 0.0064
This is a very unlikely result, if the newspaper’s claim about 80% support is true. The evidence from the sample casts doubt on the newspaper’s claim.
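The binomial probability P(x ≤ 4) used above comes from the tables or Excel’s BINOMDIST; it can also be computed directly from the binomial formula. A Python sketch, as an illustrative aside:

```python
from math import comb

# Illustrative aside: binomial left tail for Exercise 6 (n = 10, p = 0.8).
def binom_pmf(k, n, p):
    # P(x = k) = C(n, k) * p^k * (1 - p)^(n - k)
    return comb(n, k) * p**k * (1 - p)**(n - k)

p_at_most_4 = sum(binom_pmf(k, 10, 0.8) for k in range(5))
print(round(p_at_most_4, 4))  # 0.0064
```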
7. P(pass) = P(x ≥ 13, n=25, p=1/5)
= 1 – P(x ≤ 12)
≈ 1 – 1 = 0 (using the tables)
With Excel, we get the slightly more accurate result of 0.000369048. Either way, it would be basically impossible to pass this test by guessing. Does this result change any ideas you might have had that multiple “guess” tests are easy?
8.
In this case, sampling is without replacement, but we assume the tire plant produces thousands and thousands of tires. The sample size is probably much less than 5% of the population, so the binomial distribution can still be used to approximate the probabilities.
n = 20, p = 0.05
P(x=1) = P(x ≤ 1) – P(x ≤ 0) = 0.736 – 0.358 = 0.378, using the tables
Using Excel, P(x=1) = 0.3774
Using the formula:
P(x=1) = C(20,1)(0.05)¹(0.95)¹⁹ = [20!/(1!19!)](0.05)(0.377353603) = 20(0.05)(0.377353603) = 0.3774
9.
In this case, it is likely that the respondents to the poll on losing weight would not be a random sample, but rather a subset of the population of visitors to the site. Therefore, we should not apply the probability from this sample to all visitors to the site.
10. In this case, sampling is without replacement, but we assume there are thousands of managers in the population. The sample size from the poll is probably much less than 5% of the population, so the binomial distribution can still be used to approximate the probabilities.
n = 30, p = 0.342
P(x ≤ 10) = 0.544782141, from Excel
Develop Your Skills 5.3

11. μ = 5,000, σ = 367
P(x > 6000) = P(z > (6000 – 5000)/367) = P(z > 2.72) = 1 – 0.9967 = 0.0033
Only 0.33% of the bulbs will last more than 6,000 hours. With Excel, we find P(x > 6000) = 0.0032.
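The normal probabilities in this section come from the standard normal table or Excel’s NORMDIST; they can be reproduced with the error function available in most languages. This Python sketch (an illustrative aside) checks Exercise 11:

```python
from math import erf, sqrt

# Illustrative aside: normal tail probability for the light bulbs
# (mu = 5000, sigma = 367), using a CDF built from math.erf.
def phi(z):
    # Standard normal CDF: Phi(z) = (1 + erf(z / sqrt(2))) / 2
    return 0.5 * (1 + erf(z / sqrt(2)))

z = (6000 - 5000) / 367
p_over_6000 = 1 - phi(z)
print(round(p_over_6000, 4))  # 0.0032
```

The result matches the Excel answer 0.0032; the table answer 0.0033 differs slightly because the z-score is rounded to 2.72 before the lookup.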
12. μ = 5,000, σ = 367
P(x < unknown x-value) = 0.025
Check the body of the tables for a value as close as possible to 0.025. This corresponds to a z-score of –1.96.
x = μ + z•σ = 5000 – 1.96 • 367 = 4280.68
If the company guarantees that the bulbs will last 4280 hours, then only (slightly less than) 2.5% of the bulbs will fail before they achieve that life. With Excel, we find that x = 4280.69.

13. μ = $53, σ = $9
a. Need to calculate P(x < $38).
P(x < 38) = P(z < (38 – 53)/9) = P(z < –1.67) = 0.0475
The probability that a randomly-selected warranty expense for these bikes would be less than $38 is 0.0475. With Excel, we find P(x < 38) = 0.0478.
b.
Need to calculate P(38 ≤ x ≤ 62).
P(38 ≤ x ≤ 62) = P((38 – 53)/9 ≤ z ≤ (62 – 53)/9) = P(–1.67 ≤ z ≤ 1) = 0.8413 – 0.0475 = 0.7938
The probability that a randomly-selected warranty expense for these bikes would be between $38 and $62 is 0.7938. With Excel, we find P(38 x 62) = 0.7936.
c.
P(x > 68) = P(z > (68 – 53)/9) = P(z > 1.67) = 1 – 0.9525 = 0.0475
The probability that a randomly-selected warranty expense for these bikes would be above $68 is 0.0475. With Excel, we find P(x > 68) = 0.0478.

14. μ = 232,000,000, σ = 44,000,000
Since all the units are in millions, it is easy to work with units of millions, so we will restate: μ = 232, σ = 44 (millions).

a.

P(x < 200) = P(z < (200 – 232)/44) = P(z < –0.73) = 0.2327

The probability that the trading volume will be less than 200 million shares is 0.2327. With Excel, we calculate 0.2335. The answer is slightly different because we have to round the z-score to use the tables.

b.
If there is a 2% probability above x, then there is a 98% probability below x (we need to deal with left-side probabilities to use Excel or the tables). Inspect the body of the table, find the entry closest to 0.98, and identify the associated z-score. The closest entry in the table is 0.9798, with a z-score of 2.05.
x = μ + z•σ = 232 + 2.05 • 44 = 322.2
Any trading volume of 322.2 million shares or more should trigger a press release that the trading volume is in the top 2%. With Excel, we calculate 322.365 million shares.

c.

P(x > 300) = P(z > (300 – 232)/44) = P(z > 1.55) = 1 – 0.9394 = 0.0606
The trading volume exceeds 300 million shares 6.06% of the time. With Excel, we calculate 0.0611.

15. μ = 32 seconds, σ = 10 seconds
We want to find an x-value with 10% probability below it. Search the body of the normal table for a value as close as possible to 0.10. The closest value is 0.1003 and the associated z-score is –1.28.
x = μ + z•σ = 32 – 1.28 • 10 = 19.2
Workers must be able to finish the task in 19.2 seconds to escape the weekend training. With Excel, we calculate 19.18448, or 19.2 seconds (to one decimal place).

Chapter Review Exercises

Solutions provided are based on the tables and by-hand calculations. Answers based on Excel will be more accurate, and may differ slightly from those arrived at with manual calculations.
1a. The number of magazines subscribed to by a Canadian household is neither a normal nor a binomial random variable. It is not binomial, because there are more than two possible outcomes. The number of magazine subscriptions could range from 0 to some highest possible number. The random variable is discrete, as the possible values are 0, 1, 2, 3, … n. As well, it is unlikely that the probability distribution could be approximated by the normal distribution, as it is likely to be right-skewed. That is, a few households are likely to subscribe to a higher number of magazines, while most households probably subscribe to only a few. Remember, not all random variables are either normal or binomial!

1b. In this case, the random variable would be binomial. There are only two possible outcomes: either the household subscribes to one or more magazines (success) or they do not subscribe to any magazines (failure). The poll would report the number of successes in 1,235 trials. The trials are not strictly independent, as sampling would be done without replacement. However, 1,235 households represent only a small portion of all Canadian households (definitely less than 5%), so the distribution will still be approximately binomial.

1c. The annual expenditure by Canadian households on magazine subscriptions is actually a discrete random variable (all possible values are in dollars and cents, and a value like $123.47869 is not possible). However, as discussed in the text, dollar amounts are often approximated by the normal distribution. As noted above, though, this distribution may not be normal. It is likely that most households subscribe to only a few magazines, leading to lower expenditures, while a few households might subscribe to many (or more expensive) magazines. Without some actual data, it is difficult to know the shape of the distribution.
2.
a. P(x ≥ 3) = 0.11 + 0.06 + 0.03 = 0.2
b. P(x = 2 or 3) = 0.27 + 0.11 = 0.38
c. μ = 0•0.15 + 1•0.38 + 2•0.27 + 3•0.11 + 4•0.06 + 5•0.03 = 1.64
d. σ² = Σx²•P(x) – μ² = (0²•0.15 + 1²•0.38 + 2²•0.27 + 3²•0.11 + 4²•0.06 + 5²•0.03) – 1.64² = 4.16 – 2.6896 = 1.4704
σ = √1.4704 = 1.2126
3.
Summarizing the tree diagram (R: reads the financial pages, RC: does not). For each of the three randomly-selected students, independently, P(R) = 0.20 and P(RC) = 0.80. There are eight branches; each branch probability is the product of the three stage probabilities, for example:
P(RRR) = 0.2 • 0.2 • 0.2
P(RRRC) = P(RRCR) = P(RCRR) = 0.2 • 0.2 • 0.8
P(RRCRC) = P(RCRRC) = P(RCRCR) = 0.2 • 0.8 • 0.8
P(RCRCRC) = 0.8 • 0.8 • 0.8
P(x=3) = 0.2•0.2•0.2 = 0.008
P(x=2) = 3•(0.2•0.2•0.8) = 0.096
P(x=1) = 3•(0.2•0.8•0.8) = 0.384
P(x=0) = 0.8•0.8•0.8 = 0.512

Probability Distribution for Number of Business Students Who Read the Financial Pages of the Daily Newspaper (out of 3)
x      0      1      2      3
P(x)   0.512  0.384  0.096  0.008
4.
μ = 3•0.008 + 2•0.096 + 1•0.384 + 0•0.512 = 0.6
σ² = 3²•0.008 + 2²•0.096 + 1²•0.384 + 0²•0.512 – 0.6² = 0.84 – 0.36 = 0.48
so σ = √0.48 = 0.6928
A graph of the probability distribution is shown below. Notice that since P(x=3) is so small, it barely shows on the graph.
[Graph: Probability Distribution for the Number of Business Students Who Read the Financial Pages of the Daily Newspaper; a bar chart of P(x) against the number out of three, x = 0, 1, 2, 3.]
5.
Summarizing the tree diagram:

First trial: P(S) = 0.40, P(F) = 0.60
Second trial (independent): P(S) = 0.40, P(F) = 0.60
Outcomes: SS (0.16), SF (0.24), FS (0.24), FF (0.36)
Probability Distribution for Binomial Random Variable, n=2, p=0.4
x      0      1      2
P(x)   0.36   0.48   0.16

Expected value = np = 2 • 0.4 = 0.8
Mean = 0 • 0.36 + 1 • 0.48 + 2 • 0.16 = 0.8

6. Probability Distribution for Value of Construction Company Contract
x      $50,000   –$1,845
P(x)   0.25      0.75

μ = $50,000•0.25 + (–$1,845)•0.75 = $12,500 – $1,383.75 = $11,116.25
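The expected-value arithmetic for Exercise 6 is short enough to verify in a couple of lines. A Python sketch, as an illustrative aside:

```python
# Illustrative aside: expected value of the construction contract
# (Exercise 6): win $50,000 with probability 0.25, otherwise lose the
# $1,845 bid-preparation cost.
values = [50_000, -1_845]
probs = [0.25, 0.75]

expected = sum(v * p for v, p in zip(values, probs))
print(expected)  # 11116.25
```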
7.
μ = 65, σ = 12
a.
89.44% of the class passed.
P(x ≥ 50) = P(z ≥ (50 – 65)/12) = P(z ≥ –1.25) = 1 – 0.1056 = 0.8944
(Excel answer is the same.)
b.
4.75% of the class received a mark of 45% or lower.
P(x ≤ 45) = P(z ≤ (45 – 65)/12) = P(z ≤ –1.67) = 0.0475
(Excel answer is 0.0478.)
c.
69.11% of the class received a mark between 50% and 75%.
P(50 ≤ x ≤ 75) = P((50 – 65)/12 ≤ z ≤ (75 – 65)/12) = P(–1.25 ≤ z ≤ 0.83) = 0.7967 – 0.1056 = 0.6911
(Excel answer is 0.6920.)

d.

1.88% of the class received a mark of 90% or higher.
P(x ≥ 90) = P(z ≥ (90 – 65)/12) = P(z ≥ 2.08) = 1 – 0.9812 = 0.0188
(Excel answer is 0.0186.)
8.
In this case, sampling is without replacement, but we assume there are thousands of toothpaste customers. The sample size of 15 is probably much less than 5% of the population, so the binomial distribution can still be used to approximate the probabilities.
a. p = 0.05, n = 15
P(x ≤ 3) = 0.995 (from tables)

b.

n = 4, p = 0.05
P(x=2) = C(4,2)(0.05)²(0.95)² = [4!/(2!2!)](0.0025)(0.9025) = 6(0.0025)(0.9025) = 0.0135
9. a.
This is a normal probability problem. μ = 840, σ = 224
P(x > 1000) = P(z > (1000 – 840)/224) = P(z > 0.71) = 1 – 0.7611 = 0.2389
The probability that the printer produces more than 1,000 pages before this cartridge needs to be replaced is 0.2389.
b.
P(x < 600) = P(z < (600 – 840)/224) = P(z < –1.07) = 0.1423
The probability that the printer produces fewer than 600 pages before this cartridge needs to be replaced is 0.1423.
c.
We need to calculate an x-value such that P(x ≥ this x-value) = 0.95. This means there is 0.05 to the left of this x-value. Search the body of the normal table for a value as close as possible to 0.05 (and there is a “tie”: one entry is 0.0495 and one is 0.0505, and both are equally close to 0.05). Rather than approximate, go to Excel. NORMINV tells us that the x-value is 471.6 (the correct z-score is actually –1.64485). 95% of the time, the cartridges will produce at least 471.6 pages.
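NORMINV’s inverse lookup can be imitated by searching the normal CDF. This Python sketch, an illustrative aside applied to part c’s numbers (μ = 840, σ = 224), finds the cutoff by bisection:

```python
from math import erf, sqrt

# Illustrative aside: an inverse-normal lookup (what Excel's NORMINV does)
# implemented by bisection on a CDF built from math.erf.
def phi(z):
    # Standard normal CDF.
    return 0.5 * (1 + erf(z / sqrt(2)))

def norm_inv(p, mu, sigma):
    # The CDF is increasing, so bisection over a wide bracket converges.
    lo, hi = mu - 10 * sigma, mu + 10 * sigma
    for _ in range(100):
        mid = (lo + hi) / 2
        if phi((mid - mu) / sigma) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

cutoff = norm_inv(0.05, 840, 224)
print(round(cutoff, 1))  # 471.6
```

Bisection is slower than the rational approximations real spreadsheet functions use, but it makes the definition of the inverse CDF explicit.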
10. The sampling is done without replacement, but we assume there are thousands of new computers being produced. The sample size of 5 is probably much less than 5% of the population, so the binomial distribution can still be used to approximate the probabilities. a.
n=5, p = 0.01
Using the tables is usually the fastest way to calculate this probability.
P(x=1) = P(x ≤ 1) – P(x ≤ 0) = 0.999 – 0.951 = 0.048
The probability that one of the five computers will require service in the first 90 days is 0.048.
b.
n=4, p = 0.01
P(x=1) = C(4,1)(0.01)¹(0.99)³ = [4!/(1!3!)](0.01)(0.970299) = 4(0.01)(0.970299) = 0.0388
The probability that one of the four computers will require service in the first 90 days is 0.0388. 11. In this case, sampling is done without replacement. Although investment banking is a highly specialized field, “all” investment bankers would be quite a large number. The sample size of 15 is probably much less than 5% of the population, so the binomial distribution can still be used to approximate the probabilities. a.
n=15, p = 0.25
P(x = 15) = P(x ≤ 15) – P(x ≤ 14) = 1 – 1 = 0 (from the table)
The probability that all 15 have profited from insider information is essentially 0. If we use Excel, we see that there is a very small probability associated with this outcome (0.00000000093).
b.
n=15, p = 0.25
P(x ≥ 6) = 1 – P(x ≤ 5) = 1 – 0.852 = 0.148 (from the table)
The probability that at least 6 have profited from insider information is 0.148.
c.
n = 3, p = 0.25
P(x=1) = C(3,1)(0.25)¹(0.75)² = [3!/(1!2!)](0.25)(0.5625) = 3(0.25)(0.5625) = 0.4219
The probability that exactly one of the three investment bankers profited from insider information is 0.4219.

12. normal distribution, μ = 2.5%, σ = 1.0%
a. P(2.5 ≤ x ≤ 3.5) = P((2.5 – 2.5)/1.0 ≤ z ≤ (3.5 – 2.5)/1.0) = P(0 ≤ z ≤ 1) = 0.8413 – 0.5000 = 0.3413
The probability that a mutual fund has an expense fee of between 2.5% and 3.5% is 0.3413.
b.
P(x > 3) = P(z > (3 – 2.5)/1) = P(z > 0.5) = 1 – 0.6915 = 0.3085
The probability that a mutual fund has expense fees greater than 3% is 0.3085. c.
We need to find an x-value such that there is a 90% probability to the left of the x-value. Searching through the body of the normal table, we find an entry of 0.8997, with an associated z-score of 1.28.
x = μ + z•σ = 2.5 + 1.28 • 1 = 3.78
90% of mutual funds have expense fees below 3.78%.
13. Normal distribution with = $49,879 and = $7,088 a.
P(45000 ≤ x ≤ 50000) = P((45000 – 49879)/7088 ≤ z ≤ (50000 – 49879)/7088) = P(–0.69 ≤ z ≤ 0.02) = 0.5080 – 0.2451 = 0.2629
The probability of a new graduate receiving a salary between $45,000 and $50,000 is 0.2629.
b.
P(x > 55000) = P(z > (55000 – 49879)/7088) = P(z > 0.72) = 1 – 0.7642 = 0.2358
The probability of a new graduate getting a starting salary of more than $55,000 is 0.2358.
c.
Need to locate a salary such that the area to the left is 90%. Search the body of the normal table for an entry as close as possible to 0.90. The closest entry in the table is 0.8997, which has an associated z-score of 1.28.
x = μ + z•σ = 49,879 + 1.28 • 7,088 = $58,951.64
If you wanted to be earning more than 90% of new college graduates in computer information systems, you would have to earn $58,952.
14. In this case, sampling is done without replacement. If the probability of cheating applies to all college students, then the sample size of 175 would still be less than 5% of the population, so the binomial distribution can still be used to approximate the probabilities.
n = 175, p = 0.043
P(x ≥ 1) = 1 – P(x = 0)
P(x = 0) = C(175,0)(0.043)⁰(0.957)¹⁷⁵ = (0.957)¹⁷⁵ = 0.0004567
The probability that none of the students cheats is 0.0004567, so the probability that at least one of the students cheats is 1 – 0.0004567 = 0.9995. Excel provides the same answer, using BINOMDIST.

15. normal distribution, μ = 10,000 and σ = 2,525
a. With tables:
P(x > 12000) = P(z > (12000 – 10000)/2525) = P(z > 0.79) = 1 – 0.7852 = 0.2148
The percentage of the flood lamps that would last for more than 12,000 hours is 21.48%. With Excel: P(x > 12000) = 1 – P(x ≤ 12000) = 1 – 0.785843 = 0.214157.

b.
With tables: Need to find an x-value such that the probability to the left is 2%. Search the body of the table for an entry as close as possible to 0.02. There is an entry of 0.0202, with an associated z-score of –2.05.
x = μ + z•σ = 10,000 – 2.05 • 2,525 = 4,823.75
The manufacturer would advertise a lifetime of 4,823 hours, and only 2% of the bulbs will burn out before the advertised lifetime. Since the guaranteed hours are not a nice round number, the manufacturer may choose to use a value of 4,820 or even 4,800 hours instead. With Excel: Use NORMINV to get 4814.284.
16. In this case, sampling is done without replacement. Presumably there are quite a large number of frequent fliers. The sample size of 15 is probably much less than 5% of the population, so the binomial distribution can still be used to approximate the probabilities. a.
n = 15, p = 0.53 (We must use Excel for this calculation.)
P(x ≥ 10) = 1 – P(x ≤ 9) = 1 – 0.7875 = 0.2125
The probability that at least 10 of the 15 had incomes over $65,000 a year is 0.2125.
b.
n = 12, p = 0.53 P(x = 8) =0.1504 The probability that exactly 8 of the 12 had an income over $65,000 a year is 0.1504.
17. normal distribution with = $2,400 and = $756 a.
With tables:
P(x < 1000) = P(z < (1000 – 2400)/756) = P(z < –1.85) = 0.0322
The proportion of the bills that are less than $1,000 is 3.22%. With Excel: P(x < 1000) = 0.032024.
b.
With tables:
P(x > 1500) = P(z > (1500 – 2400)/756) = P(z > –1.19) = 1 – 0.1170 = 0.8830
The proportion of the bills that are more than $1,500 is 88.3%. With Excel: P(x > 1500) = 1 – P(x ≤ 1500) = 1 – 0.11693 = 0.88307.
c.
With tables: Need to find an x-value such that the area to the right of it is 0.75. We have to work with left-sided probabilities, so we note that this means there is 0.25 to the left of the x-value. We search the body of the normal table for the value closest to 0.25; it is 0.2512. The associated z-score is –0.67.
x = μ + z•σ = $2,400 – 0.67 • $756 = $1,893.48
75% of the bills are more than $1,893.48. With Excel: Use NORMINV to find $1,890.09.
18. In this case, sampling is done without replacement. Lindsay is a small town but the population is still many thousands. The sample size of 10 is much less than 5% of the population, so the binomial distribution can still be used to approximate the probabilities. a.
n = 10, p = 0.42 (We must use Excel for this question.) P(x=10) = 0.0002 The probability that all 10 of them are opposed to the proposed highway widening is 0.0002.
b.
n = 10, p = 0.42 P(x=0) = 0.0043. The probability that none of them is opposed to the proposed highway widening is 0.0043.
c.
n = 10, p = 0.42
P(x ≤ 5) = 0.7984
The probability that 5 or fewer of them are opposed to the proposed highway widening is 0.7984.
19. In this case, sampling is done without replacement. We have to come to some conclusion about the total number of customers at the bicycle store. The sample size of 25 has to be no more than 5% of the population for the binomial distribution to be used to approximate the probabilities. If the bicycle store has at least 20•25 = 500 customers, the binomial distribution can still be used. We will proceed with that assumption.
P(x = 1, n = 25, p = 0.032) = 0.3665 (from Excel).

20. With tables:
P(x > 45) = P(z > (45 – 42)/12) = P(z > 0.25) = 1 – 0.5987 = 0.4013
The probability that a customer will have to wait more than 45 minutes for help to arrive is 0.4013. With Excel: P(x ≥ 45) = 1 – P(x < 45) = 1 – 0.598706 = 0.401294.
21. a.
P(x ≤ 100, n = 200, p = 0.5) = 0.52817424 (from Excel)
b.
μ = np = 200•0.5 = 100
σ = √(npq) = √(200 • 0.5 • 0.5) = 7.071067812
c.
P(x ≤ 100, normal distribution with μ = 100, σ = 7.071067812) = 0.50
The probabilities are close, although not exactly the same. When p = 0.5, the binomial distribution has a symmetric shape like that of the normal distribution.
d.
P(x ≤ 100.5) = 0.528186059, which is very close to the actual binomial probability calculated in part a. The continuity correction factor adjustment makes the normal and binomial probabilities very close.
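Parts a through d can be verified in one pass by computing the exact binomial probability and its normal approximations, with and without the continuity correction. A standard-library Python sketch (helper names ours):

```python
import math

def binom_cdf(k, n, p):
    """Exact P(X <= k) for a binomial(n, p)."""
    return sum(math.comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

def norm_cdf(x, mu, sigma):
    """Left-tail normal probability via math.erf."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

n, p = 200, 0.5
mu = n * p                          # 100
sigma = math.sqrt(n * p * (1 - p))  # 7.071067812...

exact = binom_cdf(100, n, p)            # part a: exact binomial
plain = norm_cdf(100, mu, sigma)        # part c: normal, no correction
corrected = norm_cdf(100.5, mu, sigma)  # part d: continuity correction
print(exact, plain, corrected)  # ≈ 0.528174, 0.5, 0.528186
```

The half-unit shift from 100 to 100.5 is exactly the continuity correction the text describes, and it closes almost all of the gap between the normal and binomial answers.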
Instructor’s Solutions Manual - Chapter 6
Chapter 6 Solutions
Develop Your Skills 6.1
1. Claim about the population: graduates of the Business programs earn an average salary of $40,000 a year.
Sample result: sample mean is $39,368 for a random sample of 35 graduates.
If the college's claim is true, sample means of one-year-after-graduation salaries would be normally distributed, with a mean of $40,000 and a standard deviation of $554. The sample mean is lower than expected. We need to calculate P(x̄ ≤ $39,368).
P(x̄ ≤ 39368) = P(z ≤ (39368 − 40000)/554) = P(z ≤ −1.14) = 0.1271
It would not be highly unlikely to get a sample mean salary as low as $39,368, if the college's claim about salaries is true. Since this is not an unexpected result, we do not have reason to doubt the college's claim.
2. Claim about the population: at least 90% of customers would recommend the centre to friends.
Sample result: sample proportion is 87%, for 300 randomly-selected customers.
If the centre's claim is true, sample proportions would be normally distributed, with a mean of 0.90 and a standard deviation of 0.01732. The sample proportion is lower than expected. We need to calculate P(p̂ ≤ 0.87).
P(p̂ ≤ 0.87) = P(z ≤ (0.87 − 0.90)/0.01732) = P(z ≤ −1.73) = 0.0418
The probability of getting a sample proportion as low as 87% is just a little over 4% (which is less than our cut-off of 5%). This is an unusual sample result, and not one we would expect if the actual proportion of satisfied customers is 90%. Therefore, the sample provides evidence to doubt the centre's ad about the proportion of satisfied customers.
There are two things to note here. First, 87% does not seem to be so far from 90%, so you might have been surprised at the result of the probability calculation. This should serve as a caution. You cannot make a decision about how "far away" a sample result is from what is expected, until you actually do a probability calculation. The other point is that although the proportion of satisfied customers may not be 90%, from the sample evidence, we suspect it will be around 87% or so. This suggests that although the centre's claim may not be specifically correct, it still has a high percentage of satisfied customers.

3. Desired characteristic of the population: no more than 25% of employees would enrol in education programs.
Sample result: sample proportion is 26%, for 500 randomly-selected employees.
If the population percentage is actually 25%, sample proportions would be normally distributed, with a mean of 0.25 and a standard deviation of √((0.25)(0.75)/500) = 0.019365. The sample proportion is higher than expected. We need to calculate P(p̂ ≥ 0.26).
P(p̂ ≥ 0.26) = P(z ≥ (0.26 − 0.25)/0.019365) = P(z ≥ 0.52) = 1 − 0.6985 = 0.3015
The probability of getting a sample proportion as high as 26%, when the population proportion is actually 25%, is about 30%. It would not be unusual to get such a sample proportion, if the actual population proportion is only 25%. The sample does not provide enough evidence to conclude that the actual percentage of employees who would enrol in such programs is more than 25%. On the basis of the sample results, the company should conclude that it can afford the programs and extend the benefit.
4. Claim about the population: average mark on the mid-term statistics exam for students in the Business program is 67%.
Sample result: a random sample of 30 of the Business students taking statistics has an average mark of 62%.
If the teacher's claim is true, the sample means would be normally distributed, with a mean of 67% and a standard deviation of 3.2%. The sample mean is lower than expected. We need to calculate P(x̄ ≤ 62).
P(x̄ ≤ 62) = P(z ≤ (62 − 67)/3.2) = P(z ≤ −1.56) = 0.0594
The probability of getting a sample mean as low as 62%, if the true population mean is actually 67%, is 0.0594. Such a sample result is not unusual. The sample does not provide enough evidence to suggest that the teacher's claim overestimates the true average mark for the mid-term stats exam. Notice that the sample result is almost unusual enough for us to doubt the teacher's claim. However, we have established a rule for deciding when a sample result is unusual, and for long-run consistency in our decisions, we should follow the rule.

5. Claim about the population: average commuting time is 32 minutes.
Sample result: a random sample of 20 commuters has an average commuting time of 40 minutes.
If the true average commuting time is 32 minutes, the sample means would be normally distributed, with a mean of 32 minutes and a standard deviation of 5 minutes. The sample mean is higher than expected. We need to calculate P(x̄ ≥ 40).
P(x̄ ≥ 40) = P(z ≥ (40 − 32)/5) = P(z ≥ 1.6) = 1 − 0.9452 = 0.0548
The probability of getting a sample average commuting time as high as 40 minutes, if the actual population average commuting time is 32 minutes, is 0.0548. This result gives us pause. Such a sample result is not that usual—it will happen with a probability of only 5.48%. However, the cut-off we have decided to use for deciding what is “unusual” is a probability of 5% or less. This sample result does not meet that
test. So, in this case, the sample does not give us enough evidence to conclude that the average commuting time has increased from 32 minutes. You may not be entirely comfortable with this decision. We will discuss this point further, when we talk about p-values (in Chapter 7). For now, stick to the rule, and later, you will find out more about how these decisions are made.

Develop Your Skills 6.2
6. Claim about the population: average weight in cereal boxes is 645 grams.
Sample result: a random sample of 10 cereal boxes has an average weight of 648 grams.
We are told the cereal box weights are normally distributed, with σ = 5 grams. If the true average weight is 645 grams, the sample means would be normally distributed, with a mean of 645 and a standard deviation of 5/√10 grams.
P(x̄ ≥ 648) = P(z ≥ (648 − 645)/(5/√10)) = P(z ≥ 1.90) = 1 − 0.9713 = 0.0287
If the cereal line were properly adjusted, the probability of getting a sample mean as high as 648 grams is 0.0287. Yet we did get this unlikely result. We have evidence that the cereal line is not working properly, and it should be adjusted.

7. Claim about the population: average salary of business program graduates one year after graduation is at least $40,000 a year.
Sample result: a random sample of 20 salaries of business program graduates one year after graduation has an average of $38,000.
We are told the salaries are normally distributed, with σ = $3,300. If the true average salary is $40,000, the sample means would be normally distributed, with a mean of $40,000 and a standard deviation of 3300/√20.
P(x̄ ≤ 38000) = P(z ≤ (38000 − 40000)/(3300/√20)) = P(z ≤ −2.71) = 0.0034
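As a check on exercises 6 and 7, the z-score and probability for a sample mean can be computed directly from μ, σ, and n. Here is a standard-library Python sketch of exercise 7 (the helper name is ours):

```python
import math

def norm_cdf(x, mu=0.0, sigma=1.0):
    """Left-tail normal probability via math.erf (standard library)."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

mu, sigma, n, x_bar = 40000, 3300, 20, 38000
se = sigma / math.sqrt(n)    # standard error of the sample mean, ≈ 737.90
p = norm_cdf(x_bar, mu, se)  # P(x-bar <= 38000)
print(round(p, 4))  # ≈ 0.0034
```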
If the true average salary were $40,000, it would be almost impossible to get a sample mean as low as $38,000, under these conditions. Such a sample mean provides evidence that the average salary of graduates of the business program one year after graduation is less than $40,000.

8. Claim about the population: average tire life is 25,000 kilometres.
Sample result: a random sample of 20 tires has an average life of 24,000 kilometres.
We are told nothing about the population of tires. Sample data are not provided, so we cannot assess if the sample appears to be normally distributed. A sample size of 20 is not large enough to ensure normality of the sampling distribution unless the population is fairly normal. Without more information, we cannot proceed.

9. Claim about the population: it takes 1.5 working days on average to approve loan requests.
Sample result: a random sample of 64 loan requests has an average of 1.8 working days.
We are told the population is normally distributed, with σ = 2.0 working days. If the true average time for a loan to be approved is 1.5 working days, the sample means would be normally distributed, with a mean of 1.5 and a standard deviation of 2/√64.
P(x̄ ≥ 1.8) = P(z ≥ (1.8 − 1.5)/(2/√64)) = P(z ≥ 1.2) = 1 − 0.8849 = 0.1151
If the claim about the average time to approve the loan requests is true, the probability of getting this sample result would be 0.1151. This is not unusual, and so there is not enough evidence to conclude that the bank understates the average amount of time to approve loan requests.
10. Claim about the population: mean weight of packages is 36.7 kg.
Sample result: a random sample of 64 packages has an average weight of 32.1 kg.
We are told the population is normally distributed, with σ = 14.2 kg. If the true average weight of packages is 36.7 kg, the sample means would be normally distributed, with a mean of 36.7 kg and a standard deviation of 14.2/√64.
P(x̄ ≤ 32.1) = P(z ≤ (32.1 − 36.7)/(14.2/√64)) = P(z ≤ −2.59) = 0.0048
The probability of getting a sample mean weight as low as 32.1 kg, if the average package weight is actually 36.7 kg, is only 0.0048. We would not expect to get this sample result, but we did. The sample result provides evidence that the average package weight may have decreased.

Develop Your Skills 6.3
11. This is not really a random sample. It excludes anyone who eats the cereal but does not visit the website set up for the survey. As well, the sample is likely to be biased. People who did not find a free ticket in their cereal box are probably more likely to answer your survey. This sample data set cannot be reliably used to decide about the proportion of cereal boxes with a free ticket.
12. Claim about the population: p = 0.97 (proportion of graduates from a particular college who find jobs in their fields within a year of graduation)
Sample result: a random sample of 200 students reveals 5% who do not have a job in their field, so p̂ = 1 − 0.05 = 0.95.
Sampling is done without replacement. Do we know that 200 graduates represent not more than 5% of the graduating class? We do not, and so should proceed with caution. The binomial distribution may not be the appropriate underlying model here. We proceed by noting our assumption: that the sample of 200 graduates is not more than 5% of the graduating class.
Check conditions: np = 200(0.97) = 194; nq = 200(0.03) = 6. nq is < 10, so the normal approximation is not appropriate (the sampling distribution of p̂ should not be used).
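Exercises 12 through 14 all begin with the same np ≥ 10 and nq ≥ 10 screen before the binomial is replaced by a normal model. A tiny Python helper (our own naming, not from the text) makes the check explicit:

```python
def normal_approx_ok(n, p, threshold=10):
    """The np >= 10 and nq >= 10 rule of thumb used throughout this chapter
    before a binomial count is replaced by its normal approximation."""
    return n * p >= threshold and n * (1 - p) >= threshold

print(normal_approx_ok(200, 0.97))  # exercise 12: False (nq = 6)
print(normal_approx_ok(500, 0.01))  # exercise 13: False (np = 5)
print(normal_approx_ok(300, 0.80))  # exercise 14: True (np = 240, nq = 60)
```

When the helper returns False, the exact binomial probability (via Excel or a pmf sum) is the appropriate tool, exactly as the solutions below do.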
In the sample, 95%(200) = 190 graduates got jobs in their field within a year of graduation. Using Excel, we calculate P(x ≤ 190, n = 200, p = 0.97) = 0.080779359. If 97% of graduates find jobs in their field within a year of graduation, it would not be unusual to get a sample result like this. The sample does not provide enough evidence to suggest that the percentage of graduates getting jobs in their field a year after graduation may be lower than claimed.

13. Claim about the population: p = 0.01 (proportion of defective tires is 1%)
Sample result: a random sample of 500 tires reveals 8/500 = 1.6% that are defective.
Sampling is done without replacement. Presumably the company produces hundreds of thousands of tires, so we can be fairly confident that 500 tires is not more than 5% of the total population. The binomial probability distribution is still an appropriate underlying model.
Check conditions: np = 500(0.01) = 5; nq = 500(0.99) = 495. np < 10, so the sampling distribution of p̂ should not be used.
Using Excel and the binomial distribution, P(x ≥ 8, n = 500, p = 0.01) = 0.132319866. A sample proportion like the one we got would not be unusual, if in fact 1% of the tires are defective. There is not enough evidence to suggest that the rate of defective tires is more than 1%. Depending how the national survey was done, it might have had more response from those with defective tires.

14. Claim about the population: p = 0.80 (percentage of people who prevent a cold from developing, if they take Cold-Over as soon as a sore throat/runny nose appears)
Sample result: a random sample of 300 patients just developing cold symptoms are given Cold-Over, and 235 find the treatment successful.
Sampling is done without replacement. If Cold-Over is widely available, we presume that a very large number of people take the drug, and 300 patients is not more than 5% of the total population. The binomial probability distribution is still an appropriate underlying model.
Check conditions: np = 300(0.80) = 240, nq = 300(0.20) = 60; both are ≥ 10. Since n is fairly large, at 300, the sampling distribution of p̂ can be used.
Sampling distribution will be approximately normal, with mean = p = 0.80 and standard error = √(pq/n) = √((0.80)(0.20)/300) = 0.023094.
p̂ = 235/300 = 0.783333333
P(p̂ ≤ 0.7833333) = P(z ≤ (0.7833333 − 0.80)/√((0.80)(0.20)/300)) = P(z ≤ −0.72) = 0.2358
This sample result would not be unusual, if the success rate for Cold-Over was 80%, as claimed. The sample does not provide enough evidence to suggest that the percentage of patients who take Cold-Over as directed and successfully prevent a cold from developing is less than 80%.

15. Claim about the population: p = 0.40 (percentage of retired people who eat out at least once a week)
Sample result: a random sample of 150 retired people in your city reveals 44 who eat out at least once a week.
Sampling is done without replacement. If the city is fairly large, we can presume that 150 retired people are not more than 5% of the total population of retired people. The binomial probability distribution is still an appropriate underlying model.
Check conditions: np = 150(0.40) = 60, nq = 150(0.60) = 90; both are ≥ 10. Since n is fairly large, at 150, the sampling distribution of p̂ can be used.
Sampling distribution will be approximately normal, with mean = p = 0.40 and standard error = √(pq/n) = √((0.40)(0.60)/150) = 0.04.
p̂ = 44/150 = 0.29333333
P(p̂ ≤ 0.293333) = P(z ≤ (0.293333 − 0.40)/√((0.40)(0.60)/150)) = P(z ≤ −2.67) = 0.0038
It would be very unusual to get a sample result such as this one, if in fact 40% of retired people ate out at least once a week. The sample results suggest that fewer than 40% of retired people eat out at least once a week in your city. However, before deciding whether or not to focus on retired people, it would be important to know if this group tends to eat out more or less than other groups of people. While the percentage of those who eat out more than once a week is apparently lower in your city than in the survey, it still might be higher than for other groups, and might still be a good target market.

Chapter Review Exercises
1. a. Your sketch should look something like the diagram below.
[Sketch: normal distribution of individual heights, centred at 167.5 cm with standard deviation 12 cm; horizontal axis marked from 119.5 to 215.5 cm]
b.
The sampling distribution of the sample means (samples of size 25) will be normally distributed, because the heights in the population are normally distributed. The mean of the sample means will be 167.5 cm, and the standard error will be 12/√25 = 2.4 cm. The sampling distribution of the sample means will therefore be much narrower than the population distribution of heights. It will have to be much taller as well, since the total area under the distribution must be 1 (it is a probability distribution).
[Sketch: sampling distribution of the sample mean for n = 25, centred at 167.5 cm, much narrower and taller than the population distribution; horizontal axis marked from 119.5 to 215.5 cm]

c.
The sampling distribution of the sample means (samples of size 40) will be normally distributed, because the heights in the population are normally distributed. The mean of the sample means will be 167.5 cm, and the standard error will be 12/√40 = 1.8974 cm. It will be narrower still.
[Sketch: sampling distribution of the sample mean for n = 40, centred at 167.5 cm, narrower still; horizontal axis marked from 119.5 to 215.5 cm]
d. All three distributions are normally distributed, and all have a mean of 167.5 cm. There is greatest variability in the population distribution of heights. The sampling distribution of the means of 25 heights is much less variable, and the sampling distribution of the means of 40 heights is the least variable.

2. a. For individual heights:
P(x ≥ 180) = P(z ≥ (180 − 167.5)/12) = P(z ≥ 1.04) = 1 − 0.8508 = 0.1492
b. For samples of size 25:
P(x̄ ≥ 180) = P(z ≥ (180 − 167.5)/(12/√25)) = P(z ≥ 5.21) ≈ 0
c. For samples of size 40:
P(x̄ ≥ 180) = P(z ≥ (180 − 167.5)/(12/√40)) = P(z ≥ 6.59) ≈ 0
d.
The answers are different because the distributions are different (as we saw in the answer to Exercise 1). Individual heights are the most variable, and so it is possible to find an individual student who is taller than 180 centimetres. However, it is much harder to find an average height of 25 students that is greater than 180 centimetres. This would require a sample with a lot of fairly tall students, and not too many short ones. This is much less likely. Finally, it is very unlikely that an average height of 40 students would be greater than 180 centimetres, because about 85% of students are shorter than that (see the answer to part a).
3.
p = 0.86 (claimed proportion of those taking the pill who get relief within 1 hour)
p̂ = 287/350 = 0.82
n = 350 (fairly large)
Sampling is done without replacement. Presumably, there are thousands and thousands of back pain sufferers who take this medication, so it is still appropriate to use the binomial distribution as the underlying model.
Check for normality: np = 350(0.86) = 301 > 10; nq = 350(1 − 0.86) = 49 > 10.
The binomial distribution could be approximated by a normal distribution, and so we can use the sampling distribution of p̂.
P(p̂ ≤ 0.82) = P(z ≤ (0.82 − 0.86)/√((0.86)(0.14)/350)) = P(z ≤ −2.16) = 0.0154
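The z-score and left-tail probability for this sample proportion can be reproduced in a few lines of standard-library Python (the helper name is ours). The unrounded z of about −2.16 gives a probability a touch above the table value of 0.0154:

```python
import math

def norm_cdf(x, mu=0.0, sigma=1.0):
    """Left-tail normal probability via math.erf (standard library)."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

p, n = 0.86, 350
p_hat = 287 / 350                # 0.82
se = math.sqrt(p * (1 - p) / n)  # standard error under the claimed p
z = (p_hat - p) / se
prob = norm_cdf(z)               # P(p-hat <= 0.82)
print(round(z, 2), round(prob, 4))  # -2.16 and ≈ 0.0155 (tables: 0.0154)
```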
The probability of getting a sample result as extreme as the one we got, if the true proportion of sufferers who get relief within one hour is 86%, is 0.0154, which is less than 5%. The sample result qualifies as an unexpected or unusual event. Since we got this sample result, we have enough evidence to suggest that fewer than 86% of back pain sufferers who take this pill get relief within one hour.

4. p = 0.90 (claimed proportion of customers who are satisfied with the range of food served and prices)
p̂ = 438/500 = 0.876
n = 500 (fairly large)
Sampling is done without replacement. The sample size is fairly large, at 500. We have no information about the total number of customers served by the cafeteria. As long as this sample is no more than 5% of the total population of customers, it is appropriate to use the binomial distribution as the underlying model. We proceed by noting that we are making this assumption, and that the conclusions are not valid if this assumption is not correct.
Check for normality: np = 500(0.90) = 450 > 10; nq = 500(1 − 0.90) = 50 > 10.
The binomial distribution could be approximated by a normal distribution, so we can use the sampling distribution of p̂.
P(p̂ ≤ 0.876) = P(z ≤ (0.876 − 0.90)/√((0.90)(0.10)/500)) = P(z ≤ −1.79) = 0.0367
The probability of getting a sample proportion as low as 87.6%, if in fact 90% of cafeteria customers were satisfied with the range of food served and prices, is less than 4%. Since we did get this unusual sample result, we have strong evidence that fewer than 90% of cafeteria customers are satisfied with the range of food served and prices.

5. p = 0.17 (% of Americans whose primary breakfast beverage is milk)
p̂ = 102/500 = 0.204
n = 500 (fairly large)
Sampling is done without replacement. The sample size is 500. There are millions of Canadians who eat breakfast, and so the sample is certainly not more than 5% of the population. It is appropriate to use the binomial distribution as the underlying model.
Check for normality: np = 500(0.17) = 85 > 10; nq = 500(1 − 0.17) = 415 > 10.
The binomial distribution could be approximated by a normal distribution, so we can use the sampling distribution of p̂.
P(p̂ ≥ 0.204) = P(z ≥ (0.204 − 0.17)/0.016799) = P(z ≥ 2.02) = 1 − 0.9783 = 0.0217
There would not be much of a chance of getting a sample proportion as high as 20.4% if the true proportion of Canadians who drink milk as the primary breakfast beverage were actually 17%. The sample evidence gives us reason to think that the proportion of Canadians who choose milk as their primary breakfast beverage may be higher than the percentage of Americans who do so.
6. p = 0.65 (% of visitors who had an enjoyable experience)
p̂ = 282/400 = 0.705
n = 400 (fairly large)
Sampling is done without replacement. The sample size is 400, but we are told the attraction gets 50,000 visitors a year; 400 is less than 5% of the population of 50,000. It is appropriate to use the binomial distribution as the underlying model.
Check for normality: np = 400(0.65) = 260 > 10; nq = 400(1 − 0.65) = 140 > 10.
The binomial distribution could be approximated by a normal distribution, so we can use the sampling distribution of p̂.
P(p̂ ≥ 0.705) = P(z ≥ (0.705 − 0.65)/0.023848) = P(z ≥ 2.31) = 1 − 0.9896 = 0.0104
There is only a very small probability of getting a sample result as extreme as this one, if the percentage of visitors who had an enjoyable experience is 65%. The sample provides evidence that the percentage of visitors who had an enjoyable experience has increased. It seems reasonable to assume that the upgrades are the cause, but this is not something that can be concluded directly from these data.
7. x̄ = $756, σ = $132, μ = $700 (the claimed average cost of textbooks per semester for a college student), n = 75
We are told that the population data are normally distributed, so the sampling distribution will also be normal, with a mean of $700 and a standard error of 132/√75.
P(x̄ ≥ 756) = P(z ≥ (756 − 700)/(132/√75)) = P(z ≥ 3.67) = 1 − 0.9999 = 0.0001
There is almost no chance of getting a sample mean as high as $756 if the population average textbook cost is $700. The sample provides evidence that the average cost of textbooks per semester for college students has increased.
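The same calculation in standard-library Python (the helper name is ours), confirming that a sample mean of $756 would essentially never occur under the claim:

```python
import math

def norm_cdf(x, mu=0.0, sigma=1.0):
    """Left-tail normal probability via math.erf (standard library)."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

mu, sigma, n, x_bar = 700, 132, 75, 756
se = sigma / math.sqrt(n)  # 132 / sqrt(75), about 15.24
z = (x_bar - mu) / se
p = 1.0 - norm_cdf(z)      # P(x-bar >= 756)
print(round(z, 2), round(p, 4))  # 3.67, 0.0001
```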
8. x̄ = $8400, σ = $3700, μ = $7500 (the claimed average total withdrawal at the ATM over the weekend), n = 36
We are told to assume the population data are normally distributed, so the sampling distribution will also be normal, with a mean of $7500 and a standard error of 3700/√36.
P(x̄ ≥ 8400) = P(z ≥ (8400 − 7500)/(3700/√36)) = P(z ≥ 1.46) = 1 − 0.9279 = 0.0721
If the true average total withdrawal is $7500, the probability of getting a sample mean total withdrawal of $8400 is 0.0721. This sample result is not unexpected enough to conclude that the bank manager’s claim about the total average withdrawal over the weekend is too low.
9.
Start by analyzing the sample data set.
x̄ = 6188.3875 hours, s = 234.6176, μ = 6200 hours (claimed average lifespan of electronic component), n = 40
Histogram of sample data is shown below.
[Histogram: Lifespans of a Random Sample of 40 Electronic Components; x-axis: Lifespan in Hours, y-axis: Number of Components]
The histogram appears to be fairly normal, with some skewness to the right. The sample size is fairly large, at 40, so it seems likely the sampling distribution would be normally distributed. The sampling distribution would have a mean of 6200, and a standard error of 234.6176/√40. We are told to assume σ = s = 234.6176.
P(x̄ ≤ 6188.3875) = P(z ≤ (6188.3875 − 6200)/(234.6176/√40)) = P(z ≤ −0.31) = 0.3783
The probability of getting a sample result as small as 6188.3875 hours, if the true average lifespan of the components is 6200, is 0.3783. This sample result is not unusually small. The sample evidence does not give us reason to doubt the producer's claim that the average lifespan of the electronic components is 6200 hours.
10. Start by analyzing the sample data set.
x̄ = 36242.22, s = 5026.95, μ = $37,323 (claimed average income of business grads the year after graduation), n = 45
Histogram of sample data is shown below.
[Histogram: Random Sample of 45 Graduates of Business Program; x-axis: Annual Income in the First Year After Graduation, y-axis: Number of Graduates]
This histogram is significantly skewed to the left, and appears to be somewhat bimodal, with a mode in each of the lower and upper halves of the range of the data. In this case, we cannot assume the sampling distribution of x̄ would be normal, and so we cannot proceed.
11. Start by analyzing the data set.
x̄ = 1756.48, s = 599.773, μ = $2000 (claimed average daily sales), n = 29
The histogram is skewed to the right. The sample size, at 29, is reasonably large. We will proceed by assuming that with this sample size, the sampling distribution will be approximately normal.
P(x̄ ≤ 1756.48) = P(z ≤ (1756.48 − 2000)/(599.773/√29)) = P(z ≤ −2.19) = 0.0144
The probability of getting average daily sales as low as $1,756.48, if the true average daily sales are $2,000, is quite small, at 1.44%. The fact that we got this unusual sample result gives us reason to doubt the former owner's claim that average daily sales at the shop were $2,000. However, it may be that the sales have changed under the new owner, for a variety of reasons. The data cannot allow us to conclude that the former owner misrepresented the daily sales figures.
12. p = 0.25 n=400 number of successes = 82 (where "success" is defined as a student living in the immediate area) Sampling is done without replacement. The sample size is 400, but we are told the college has over 10,000 students. 400 is less than 5% of the population of 10,000. It is appropriate to use the binomial distribution as the underlying model. Check for normality: np = 400(0.25) = 100 > 10 nq = 400(1 – 0.25) = 300 > 10 The binomial distribution could be approximated by a normal distribution, so we can use the sampling distribution of p̂ . a.
P(x ≤ 82, n = 400, p = 0.25) = 0.0200
b.
p̂ = 82/400 = 0.205
c.
P(p̂ ≤ 0.205) = P(z ≤ (0.205 − 0.25)/0.021651) = P(z ≤ −2.078) = 0.0188
d.
The probabilities calculated in parts a and c are different because the probability in part c is the normal approximation of the binomial probability calculated in part a. The probabilities are close in value, but not exactly the same.
e.
There is evidence that the percentage of college students from the immediate catchment area is lower than in the past. The probability of getting a sample result as low as 82 out of 400 is about 2%, presuming that 25% of students come from the local catchment area. This sample result is unusually low, and the fact that we got it provides evidence that the percentage of students from the local area has declined.
13. p = 0.20, n = 300, number of successes = 77 (where "success" is defined as a student with a laptop)
Sampling is done without replacement. The sample size is 300, and we have no information about the total number of students at the college. We can proceed, first by noting that we are assuming there are at least 300•20 = 6000 students at the college. We will use the binomial distribution as the underlying model.
Check for normality: np = 300(0.20) = 60 > 10; nq = 300(1 − 0.20) = 240 > 10.
The binomial distribution could be approximated by a normal distribution, so we can use the sampling distribution of p̂.
P(p̂ ≥ 0.2567) = P(z ≥ (0.2567 − 0.20)/0.023094) = P(z ≥ 2.45) = 1 − 0.9929 = 0.0071
It would be almost impossible to find 77 laptops among 300 students, if the claimed percentage ownership was still 20%. Since we found this result, we have evidence to suggest that the percentage of students with laptop computers is now more than 20%.

14. Use Excel's Histogram tool to create a frequency distribution of 0's and 1's. The results are as follows (the total was calculated with an Excel formula).

Bin      Frequency
0        25
1        29
Total    54

n = 54, p̂ = 29/54 = 0.537037, p = 0.60
Sampling is done without replacement. The sample size is 54, and we have no information about the total number of visitors to the winery's retail shop. We can proceed, first by noting that we are assuming there are at least 54•20 = 1080 visitors to the winery. We will use the binomial distribution as the underlying model.
Check conditions: np = 54(0.6) = 32.4 > 10; nq = 54(1 − 0.6) = 21.6 > 10.
P(p̂ ≤ 0.537037) = P(z ≤ (0.537037 − 0.60)/0.066667) = P(z ≤ −0.9444) = 0.1725 (using Excel; the answer is 0.1736 using the tables)
The probability of getting a sample proportion of female visitors as low as 53.7%, if the actual proportion of females is actually 60%, is over 17%. Although the proportion of females in the sample is lower than expected, it is not unusually low. The sample does not give us evidence that the proportion of female customers is lower than the owner believes.
Instructor’s Solutions Manual - Chapter 7
Chapter 7 Solutions
Develop Your Skills 7.1
1.
a. H0: p = 0.05; H1: p < 0.05
b. A Type I error arises when we mistakenly reject the null hypothesis when it is in fact true. This would correspond to accepting a shipment of keyboards when 5% or more of them were defective. A Type II error arises when we mistakenly fail to reject the null hypothesis when it is in fact false. This would correspond to refusing to accept a shipment of keyboards that actually had fewer than 5% defective.
c. From the manufacturer's point of view, Type I error is probably more important, because it would lead to using more faulty keyboards than desired. This would likely lead to customer complaints, and might hurt the company's quality reputation.
d. From the supplier's point of view, the Type II error would be more frustrating, because a shipment that should have been accepted was returned.
a. b.
c.
d.
3.
a.
b.
H0: µ = 15 minutes H1: µ < 15 minutes A Type I error arises when we mistakenly reject the null hypothesis when it is in fact true. This would correspond to concluding that the new pain reliever provided quicker pain relief than the old formula, when in fact it didn’t. A Type II error arises when we mistakenly fail to reject the null hypothesis when it is in fact false. This would correspond to concluding that the new pain reliever did not provide quicker pain relief than the old formula, when in fact it did. If the company reported that it rejected the null hypothesis at a 10% significance level, I would not be inclined to switch to the new drug (although I would like to know the p-value to make my final decision). Such a high level of significance makes it easy to reject the null, and so there is a higher chance that it is in fact true. A p-value of 1% tells me the sample result would have occurred with a probability of only 1%, if the null hypothesis were true. This is very unlikely, yet it occurred. With such a result, there is strong evidence that the new pain reliever provides quicker relief than the old formula. H0: µ = 142 ml H1: µ ≠ 142 ml This should be a two-tailed test. Both underfilled and overfilled cans of peaches present problems. A Type I error arises when we mistakenly reject the null hypothesis when it is in fact true. This would correspond to concluding the wrong amount of peaches was going into the cans, and making some adjustments, when in fact everything was fine, and no adjustments were necessary.
Copyright © 2011 Pearson Canada Inc.
129
Instructor’s Solutions Manual - Chapter 7
c.
4.
A Type II error arises when we mistakenly fail to reject the null hypothesis when it is in fact false. This would correspond to concluding that the right amount of peaches were going into the cans, and making no adjustments, when in fact the cans were being either underfilled or overfilled, and some adjustment was necessary. If I were a consumer the underfilled cans would be most important to me! Type II errors are more important, particularly if they led to underfilled cans of peaches. This could also be a consequence of Type I errors.
np = 500(0.35) = 175 nq = 500(1=0.35) = 325 Both are > 10, so the sampling distribution of p̂ will be approximately normal. Although sampling is done without replacement, we are told the sample is less than 5% of the population. p̂ = 0.36 This is a right-tailed test. To get the p-value, calculate P(sample statistic ≥ observed sample result).
P(p̂ ≥ 0.36) ⎛ ⎞ ⎜ ⎟ 0.36 − 0.35 ⎟ ⎜ = P⎜ z ≥ (0.35)(0.65) ⎟ ⎜⎜ ⎟⎟ 500 ⎝ ⎠ = P(z ≥ 0.47 ) = 1 − 0.6908 = 0.3192 The p-value is 0.3192. 5.
This is now a two-tailed test, so the p-value = 2 • 0.3192 = 0.6384.
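The z-score and p-value arithmetic in Exercises 4 and 5 can be checked with a short script. Python is not part of the manual; this is an illustrative sketch using the standard library's NormalDist, so the results come from the exact normal CDF rather than the rounded table values and may differ in the fourth decimal place.

```python
from math import sqrt
from statistics import NormalDist

# Exercise 4: right-tailed test of H0: p = 0.35 vs H1: p > 0.35
p0, p_hat, n = 0.35, 0.36, 500
se = sqrt(p0 * (1 - p0) / n)            # standard error of p-hat under H0
z = (p_hat - p0) / se                   # z is approximately 0.47
p_one_tailed = 1 - NormalDist().cdf(z)  # right-tail area

# Exercise 5: same data, two-tailed test, so double the tail area
p_two_tailed = 2 * p_one_tailed

print(round(z, 2), round(p_one_tailed, 4), round(p_two_tailed, 4))
```

The doubling in the last step is exactly the "2 • 0.3192" in the solution to Exercise 5.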
Develop Your Skills 7.2

6. H0: p = 0.25
H1: p > 0.25
α = 0.05
p̂ = 0.27
n = 1006
Sampling is done without replacement. The population is Canadian homeowners, of which there are millions, so the sample of 1006 is less than 5% of the population.
np = 1006(0.25) = 251.5
nq = 1006(1 – 0.25) = 754.5
Both are ≥ 10, so the sampling distribution of p̂ will be approximately normal, with a mean of 0.25 and a standard error of
σp̂ = √(pq/n) = √((0.25)(0.75)/1006) = 0.013652169

P(p̂ ≥ 0.27)
= P(z ≥ (0.27 − 0.25) / √((0.25)(0.75)/1006))
= P(z ≥ 1.46)
= 1 − 0.9279
= 0.0721

p-value = 0.0721 > α = 0.05
We fail to reject H0. There is insufficient evidence to infer that more than a quarter of homeowners spend more than they planned on home renovation projects.

7. H0: p = 0.3333
H1: p > 0.3333
α = 0.02
p̂ = 0.34
n = 1006
Sampling is done without replacement. The population is Canadian homeowners, of which there are millions, so the sample of 1006 is less than 5% of the population.
np = 1006(0.3333) = 335.3333
nq = 1006(1 – 0.3333) = 670.6667
Both are ≥ 10, so the sampling distribution of p̂ will be approximately normal, with a mean of 0.3333 and a standard error of
σp̂ = √(pq/n) = √((0.3333)(0.6667)/1006) = 0.014862599

P(p̂ ≥ 0.34)
= P(z ≥ (0.34 − 0.3333) / √((0.3333)(0.6667)/1006))
= P(z ≥ 0.45)
= 1 − 0.6736
= 0.3264
p-value = 0.3264 > α = 0.02
We fail to reject H0. There is insufficient evidence to infer that more than a third of homeowners borrow to renovate.

8. H0: p = 0.15
H1: p > 0.15
α = 0.04
p̂ = 0.17
n = 1403
Sampling is done without replacement. However, there are probably millions of Canadian cell phone users, so the sample is not more than 5% of the population.
np = 1403(0.15) = 210.45
nq = 1403(1 – 0.15) = 1192.55
Both are ≥ 10, so the sampling distribution of p̂ will be approximately normal, with a mean of 0.15 and a standard error of
σp̂ = √(pq/n) = √((0.15)(0.85)/1403) = 0.009533

P(p̂ ≥ 0.17)
= P(z ≥ (0.17 − 0.15) / √((0.15)(0.85)/1403))
= P(z ≥ 2.10)
= 1 − 0.9821
= 0.0179

p-value = 0.0179 < α = 0.04
We reject H0. There is sufficient evidence to infer that more than 15% of all Canadian cellphone and smartphone users typically access the internet on a daily basis from their phones.

9. H0: p = 0.10
H1: p < 0.10
α = 0.05
p̂ = 18/200 = 0.09
n = 200
Sampling is done without replacement. We have no information about the total number of customers in the store, but presumably it would be thousands, so the sample of 200 is not more than 5% of the population.
np = 200(0.10) = 20
nq = 200(1 – 0.10) = 180
Both are ≥ 10, so the sampling distribution of p̂ will be approximately normal, with a mean of 0.10 and a standard error of
σp̂ = √(pq/n) = √((0.1)(0.9)/200) = 0.021213

P(p̂ ≤ 0.09)
= P(z ≤ (0.09 − 0.10) / √((0.1)(0.9)/200))
= P(z ≤ −0.47)
= 0.3192

p-value = 0.3192 > α = 0.05
We fail to reject H0. There is insufficient evidence to infer that fewer than 10% of customers opt for the extended warranty coverage.
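Exercises 6 through 10 repeat the same recipe: standard error under H0, z-score, then a tail probability. A small reusable helper (hypothetical, not from the text) makes the pattern explicit; it is shown here on Exercises 6 and 9. Because the exact normal CDF is used, the p-values differ slightly from the table-based answers.

```python
from math import sqrt
from statistics import NormalDist

def one_prop_z_test(p_hat, p0, n, tail):
    """One-sample z-test for a proportion (illustrative helper).
    tail is 'right', 'left', or 'two'. Returns (z, p_value).
    Assumes n*p0 and n*(1 - p0) are both at least 10."""
    se = sqrt(p0 * (1 - p0) / n)   # standard error under H0
    z = (p_hat - p0) / se
    cdf = NormalDist().cdf
    if tail == 'right':
        p_value = 1 - cdf(z)
    elif tail == 'left':
        p_value = cdf(z)
    else:                          # two-tailed: double the outer tail
        p_value = 2 * (1 - cdf(abs(z)))
    return z, p_value

# Exercise 6: H1: p > 0.25, p-hat = 0.27, n = 1006
z6, p6 = one_prop_z_test(0.27, 0.25, 1006, 'right')
# Exercise 9: H1: p < 0.10, p-hat = 0.09, n = 200
z9, p9 = one_prop_z_test(0.09, 0.10, 200, 'left')
print(round(p6, 4), round(p9, 4))
```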
10. H0: p = 0.25
H1: p > 0.25
α = 0.04
p̂ = 14/50 = 0.28
n = 50
Sampling is done without replacement. We have no information about the total number of customers in the diner. As long as the diner has at least 1,000 customers, the sample of 50 would be not more than 5% of the population.
np = 50(0.25) = 12.5
nq = 50(1 – 0.25) = 37.5
Since np and nq are both ≥ 10, the sampling distribution of p̂ will be approximately normal, with a mean of 0.25 and a standard error of
σp̂ = √(pq/n) = √((0.25)(0.75)/50) = 0.061237244

P(p̂ ≥ 0.28)
= P(z ≥ (0.28 − 0.25) / √((0.25)(0.75)/50))
= P(z ≥ 0.49)
= 1 − 0.6879
= 0.3121

p-value = 0.3121 > α = 0.04
We fail to reject H0. There is insufficient evidence to infer that more than 25% of customers choose salad instead of fries with their main course.

Develop Your Skills 7.3

11. H0: µ = $50,000
H1: µ > $50,000
α = 0.03
With Excel, we calculate:
x̄ = 50,356
s = 7,962.922669
n = 40 (given)
A histogram of the data is somewhat skewed to the right. However, the sample size is fairly large, at 40, and so this is "normal enough" to use the t-distribution.
[Histogram: Household Incomes in a Halifax Suburb. Vertical axis: Number of Households; horizontal axis: Annual Income.]

Since we are using Excel, it makes sense to use the template. Results are shown below.

Making Decisions About the Population Mean with a Single Sample
Do the sample data appear to be normally distributed? yes
Sample Standard Deviation s: 7962.92
Sample Mean: 50356.00
Sample Size n: 40
Hypothetical Value of Population Mean: 50000
t-Score: 0.28275
One-Tailed p-Value: 0.38943
Two-Tailed p-Value: 0.77886
This is a one-tailed test. p-value = 0.3894 > α = 0.03. We fail to reject H0. There is insufficient evidence to infer that average household incomes in this particular suburb were more than $50,000 a year.
12. H0: µ = $33
H1: µ > $33
α = 0.03
x̄ = $34.21
s = $10
n = 500 (half of the 1000 people surveyed)
We are told that the sample data are approximately normally distributed, and so we will assume the population data are as well.

P(x̄ ≥ 34.21)
= P(t ≥ (34.21 − 33) / (10/√500))
= P(t ≥ 2.706)

This is a one-tailed test. The t-distribution has 499 degrees of freedom. For this, we will use the table row for 200 degrees of freedom.
p-value = P(t ≥ 2.706) < 0.005
We reject H0. There is sufficient evidence to infer that Canadian women spend more than $33, on average, when they make quick trips to the grocery store.

13. H0: µ = $37,876
H1: µ < $37,876
α = 0.02
Using Excel, we find:
x̄ = 35,238
s = 2,752.578754
n = 50
A histogram of the data is shown below.
[Histogram: Salaries of a Random Sample of Entry-Level Clerks in the Area. Vertical axis: Number of Clerks; horizontal axis: Annual Salary.]
The histogram is unimodal and fairly symmetric. The sample size is fairly large, at 50. It is appropriate to use the t-distribution. Since we are using Excel, we will use the template. It is shown below.

Making Decisions About the Population Mean with a Single Sample
Do the sample data appear to be normally distributed? yes
Sample Standard Deviation s: 2752.58
Sample Mean: 35238
Sample Size n: 50
Hypothetical Value of Population Mean: 37876
t-Score: -6.7767
One-Tailed p-Value: 7.4E-09
Two-Tailed p-Value: 1.5E-08

This is a one-tailed test, so the p-value is 7.4 × 10⁻⁹, or 0.0000000074, which is very small. It would be almost impossible to get the sample mean that was obtained from the sample if the average salary of entry-level clerks in the area was actually $37,876. The fact that we did get this sample mean provides strong evidence against the null hypothesis. The p-value < α = 0.02. We reject H0. There is strong evidence to infer that the average salary of entry-level clerks in the area is lower than $37,876. In other words, the average salary of entry-level clerks in the company is higher than the area average.
14. H0: µ = 10
H1: µ < 10
α = 0.04
x̄ = 8.9
s = 5.5
n = 60
We are told that the sample data appear normally distributed, so we will assume the population data are normally distributed.

P(x̄ ≤ 8.9)
= P(t ≤ (8.9 − 10) / (5.5/√60))
= P(t ≤ −1.55)

We know P(t ≤ −1.55) = P(t ≥ +1.55). We need to refer to the t-distribution with 59 degrees of freedom. The closest entries in the t-table are in the row for 60 degrees of freedom. A t-score of 1.55 would be located between t.100 and t.050.
0.050 < P(t ≥ +1.55) < 0.100
This is a one-tailed test, so 0.050 < p-value < 0.100.
p-value > α = 0.04
We fail to reject H0. There is insufficient evidence to contradict the company's idea that most baby boomer households have stereo equipment that is at least 10 years old.
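The table-bracketing argument in Exercise 14 can be verified numerically. The sketch below integrates the Student-t density with Simpson's rule using only the standard library; the integration bound and step count are arbitrary choices of ours that are more than adequate at these degrees of freedom.

```python
from math import exp, lgamma, pi, sqrt

def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    c = exp(lgamma((df + 1) / 2) - lgamma(df / 2)) / sqrt(df * pi)
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def t_upper_tail(t0, df, upper=60.0, steps=20000):
    """P(T >= t0) by Simpson's rule on [t0, upper]; the density
    beyond `upper` is negligible for the df used here."""
    h = (upper - t0) / steps
    s = t_pdf(t0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        s += (4 if i % 2 else 2) * t_pdf(t0 + i * h, df)
    return s * h / 3

# Exercise 14: t = -1.55 with 59 degrees of freedom; by symmetry the
# left-tail p-value equals the right tail at +1.55
p = t_upper_tail(1.55, 59)
print(round(p, 4))  # lands between 0.050 and 0.100, as the table rows show
```

This confirms the bracketing: the exact tail area sits between t.100 and t.050, so the p-value exceeds α = 0.04.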
15. H0: µ = $85
H1: µ > $85
α = 0.05
x̄ = $87.43
s = $16.23
n = 15
We are told to assume the population data are normally distributed.

P(x̄ ≥ 87.43)
= P(t ≥ (87.43 − 85) / (16.23/√15))
= P(t ≥ 0.580)

We need to refer to the t-distribution with 14 degrees of freedom. A t-score of 0.580 would be located to the left of t.100. This is a one-tailed test, so p-value > 0.100 > α = 0.05.
We fail to reject H0. There is insufficient evidence to indicate that the average price of the company's competitors is higher than $85 for cleaning a bedroom carpet. If the company wants to be sure that its rates are lower, it must decrease them.

Chapter Review Exercises

1. a. H0: p = 0.03
H1: p < 0.03
b. A Type I error occurs when we mistakenly reject H0 when it is in fact true. In this case, this would correspond to concluding that the shipment contained fewer than 3% defectives, when this was not the case. A Type II error occurs when we fail to reject H0 when it is in fact false. In this case, this would correspond to rejecting the shipment when in fact it contained fewer than 3% defectives.
c. From the toy manufacturer's point of view, Type I error would be important, because it would lead to a higher rate of defective components installed in the toys, which might lead to customer complaints. Type II errors would also have consequences, because they might mean unnecessary delays in production and difficult relations with suppliers. However, Type I error is probably more important.
2. a. H0: µ = 10.6 litres per 100 kilometres
H1: µ < 10.6 litres per 100 kilometres
b. A Type I error occurs when we mistakenly reject H0 when it is in fact true. In this case, this would correspond to concluding that the new gasoline gave better gas mileage, when in fact it did not. A Type II error occurs when we fail to reject H0 when it is in fact false. In this case, this would correspond to failing to conclude that the new gasoline gave better gas mileage, when in fact it did.

3. a. H0: µ = 750 ml
H1: µ ≠ 750 ml
This is a two-tailed test, because there are problems if the water bottles contain either too much or too little water.
b. A Type I error occurs when we mistakenly reject H0 when it is in fact true. In this case, this would correspond to concluding that the bottles do not contain the correct amount of water (and probably adjusting something), when in fact they were fine. A Type II error occurs when we fail to reject H0 when it is in fact false. In this case, this would correspond to failing to notice when the bottles did not contain the correct amount of water, which might lead to customer complaints (or lower profits).
c. As a consumer, you would probably be most concerned about Type II errors, particularly when they led to underfilled bottles. Underfilled bottles could also be a consequence of Type I errors.

4. P(x̄ ≤ 296.5)
= P(t ≤ (296.5 − 300) / (35.6/√40))
= P(t ≤ −0.622)

P(t ≤ −0.622) = P(t ≥ +0.622). There is no row in the t-table for 39 degrees of freedom. We will use the row for 40 degrees of freedom (the closest available). If we were to place a t-score of 0.622 in the table, it would be to the left of t.100. So we know P(t ≥ 0.622) > 0.100. Since this is a one-tailed test, this means p-value > 0.100.
5. p-value = 2 • P(p̂ ≥ 0.30)
= 2 • P(z ≥ (0.30 − 0.26) / √((0.26)(0.74)/200))
= 2 • P(z ≥ 1.29)
= 2 • 0.0985
= 0.197

6. a. H0: µ = 48.2
H1: µ < 48.2
b. A Type I error occurs when we mistakenly reject H0 when it is in fact true. In this case, this would correspond to concluding that the efforts to reach younger listeners had been successful, when in fact they had not been. A Type II error occurs when we fail to reject H0 when it is in fact false. In this case, this would correspond to concluding that the efforts to reach younger listeners had failed, when in fact they had succeeded. This might lead to making greater efforts to reach younger listeners, when such efforts were not necessary.
c. If the null hypothesis were rejected with a p-value of 1%, there would be strong evidence that the radio station had succeeded in reaching a younger audience.
d. If the sample mean is 46, it would appear that the younger listeners are not that much younger than before! In this case, it would probably have been better to set a new target age, and test to see if the average age was below this. Having younger listeners than previously is probably not the correct goal.
7. H0: µ = $725
H1: µ < $725
α = 0.05
We are told to assume that the population data are normally distributed.
x̄ = $641
s = $234
n = 2711

P(x̄ ≤ 641)
= P(t ≤ (641 − 725) / (234/√2711))
= P(t ≤ −18.691)

Of course, there is no row in the t-table for 2710 degrees of freedom. If we look at the last row in the table, we can see that the p-value < 0.005 (probably much less). This sample result would be practically impossible to get if Canadian internet shoppers were actually spending an average of $725 shopping online annually.
p-value < α, so reject H0. There is very strong evidence that the annual internet spending by Canadians is less than $725.

8. H0: p = 0.90
H1: p < 0.90
α = 0.04
In Chapter 6, we noted that we had to assume that the sample of 500 is no more than 5% of the total population, an assumption we cannot check. Also, we noted that np and nq ≥ 10, so the sampling distribution of p̂ would be approximately normal. We also calculated:

P(p̂ ≤ 0.876)
= P(z ≤ (0.876 − 0.90) / √((0.90)(0.10)/500))
= P(z ≤ −1.79)
= 0.0367

This is a one-tailed test. The p-value < α, so there is evidence that the proportion of the cafeteria's customers who are satisfied with the range of food served and prices is less than 90%.
9. H0: p = 0.17
H1: p > 0.17
α = 0.01
In Chapter 6, we noted that the sample of 500 was no more than 5% of the total population. Also, we noted that np and nq ≥ 10, so the sampling distribution of p̂ would be approximately normal. We also calculated:

P(p̂ ≥ 0.204)
= P(z ≥ (0.204 − 0.17) / √((0.17)(0.83)/500))
= P(z ≥ 2.02)
= 1 − 0.9783
= 0.0217

This is a one-tailed test. The p-value = 0.0217 > α = 0.01. Fail to reject H0. There is not enough evidence to conclude that the proportion of Canadians who drink milk as the primary breakfast beverage is greater than 17%, the percentage of Americans who drink milk as the primary breakfast beverage. Note that this is not the same conclusion we drew in Chapter 6, where the implied level of significance was 5%. In this exercise, we have used a smaller level of significance. Under these new conditions, it is harder to reject the null hypothesis.

10. H0: p = 0.65
H1: p > 0.65
α = 0.03
In Chapter 6, we noted that the sample of 400 was not more than 5% of the total number of visitors. Also, we noted that np and nq ≥ 10, so the sampling distribution of p̂ would be approximately normal. We also calculated:

P(p̂ ≥ 0.705)
= P(z ≥ (0.705 − 0.65) / √((0.65)(0.35)/400))
= P(z ≥ 2.31)
= 1 − 0.9896
= 0.0104
The p-value < α, so there is evidence that the proportion of visitors who felt they had an enjoyable experience was more than 65%. As noted earlier, this does not prove that the upgrades to the tourist attraction caused the change, although it might be the case.

11. H0: p = 0.05
H1: p > 0.05
α = 0.02
Sampling is done without replacement, and the sample size is 500. As long as the sample of 500 is not more than 5% of the total population of employees, it is appropriate to use the binomial distribution as the underlying model. We proceed by noting that we are making this assumption, and that the conclusions are not valid if this assumption is not correct.
np = 500(0.05) = 25
nq = 500(1 – 0.05) = 475
Both are ≥ 10, so the sampling distribution of p̂ will be approximately normal.
p̂ = 38/500 = 0.076

P(p̂ ≥ 0.076)
= P(z ≥ (0.076 − 0.05) / √((0.05)(0.95)/500))
= P(z ≥ 2.67)
= 1 − 0.9962
= 0.0038

p-value = 0.0038 < α = 0.02
We reject H0. There is sufficient evidence to suggest that the proportion of employees who would use the tuition subsidy program is greater than 5%.
12. H0: µ = $700
H1: µ > $700
α = 0.05
We are told to assume that the population data are normally distributed.
x̄ = $756
s = $132
n = 75

P(x̄ ≥ 756)
= P(t ≥ (756 − 700) / (132/√75))
= P(t ≥ 3.67)

There are no entries in the t-table for 74 degrees of freedom. However, whether we look at the row for 70 degrees of freedom or 80 degrees of freedom, we reach the same conclusion: a t-score of 3.67 is to the right of t.005. So we can conclude that P(t ≥ 3.67) < 0.005.
We reject H0. There is sufficient evidence to suggest that the average cost of textbooks for college students has increased.
13. H0: p = 0.5
H1: p > 0.5
α = 0.05
n = 20 + 47 + 32 + 15 + 9 = 123
Sampling is done without replacement. As long as the sample of 123 is not more than 5% of the total population of customers, it is appropriate to use the binomial distribution as the underlying model. We proceed by noting that we are making this assumption, and that the conclusions are not valid if this assumption is not correct.
np = 123(0.5) = 61.5
nq = 123(1 – 0.5) = 61.5
Both are ≥ 10, so the sampling distribution of p̂ will be approximately normal.
p̂ = (20 + 47)/123 = 0.54472

P(p̂ ≥ 0.54472)
= P(z ≥ (0.54472 − 0.5) / √((0.5)(0.5)/123))
= P(z ≥ 0.99)
= 1 − 0.8389
= 0.1611

p-value = 0.1611 > α = 0.05
We fail to reject H0. There is not enough evidence to infer that more than half of customers agree or strongly agree that the staff at the local branch can provide good advice on their financial affairs.
14. H0: µ = 58.2
H1: µ > 58.2
α = 0.05
We are told to assume that the test marks are normally distributed.
x̄ = 65.4
s = 18.6
n = 20

P(x̄ ≥ 65.4)
= P(t ≥ (65.4 − 58.2) / (18.6/√20))
= P(t ≥ 1.731)

We must consult the t-table for the row with 19 degrees of freedom. We see that 1.731 is between t.050 and t.025. So we can conclude that P(t ≥ 1.731) < 0.05.
We reject H0. There is enough evidence to suggest that the average mark was higher in Mr. Wilson's class. However, this does not mean that Mr. Wilson's test was easier than Ms. Hardy's. There are many other possible explanations, such as:
• Mr. Wilson is a better teacher
• Mr. Wilson's students work harder
• Mr. Wilson's students are smarter
• Mr. Wilson's students have a better class schedule, which allows them to absorb the material better
15. This data set was examined in Chapter 6 Review Exercise 9. The sample data appeared to be approximately normally distributed.

[Histogram: Lifespans of a Random Sample of 40 Electronic Components. Vertical axis: Number of Components; horizontal axis: Lifespan in Hours.]

The Excel template for this data set is shown below.

Making Decisions About the Population Mean with a Single Sample
Do the sample data appear to be normally distributed? yes
Sample Standard Deviation s: 234.6175952
Sample Mean: 6188.3875
Sample Size n: 40
Hypothetical Value of Population Mean: 6200
t-Score: -0.3130366
One-Tailed p-Value: 0.3779604
Two-Tailed p-Value: 0.7559208
H0: µ = 6200
H1: µ < 6200
α = 0.05
This is a one-tailed test. From the template, we see that p-value = 0.378 > α = 0.05. Fail to reject the null hypothesis. There is not enough evidence to suggest that the average lifespan of the electronic components is less than 6200 hours.
16. This data set was examined in Chapter 6 Review Exercise 11. We noted that the histogram was skewed to the right, but that the sample size, at 29, was fairly large, and so we assumed the sampling distribution would be fairly normal. The Excel template for this data set is shown below.

Making Decisions About the Population Mean with a Single Sample
Do the sample data appear to be normally distributed? yes
Sample Standard Deviation s: 599.773
Sample Mean: 1756.48
Sample Size n: 29
Hypothetical Value of Population Mean: 2000
t-Score: -2.1865
One-Tailed p-Value: 0.01865
Two-Tailed p-Value: 0.0373

H0: µ = 2000
H1: µ < 2000
α = 0.04
This is a one-tailed test. From the template, we see that p-value = 0.0187 < α = 0.04. Reject the null hypothesis. There is enough evidence to suggest that the average daily sales at the shop are less than $2,000. However, as noted before, this could be the result of the change of ownership, and does not necessarily indicate the former owner was not being truthful.

17. H0: p = 0.25
H1: p < 0.25
α = 0.05
Sampling is done without replacement, and the sample size is 2400. However, the population is all Canadians, so the sample is definitely less than 5% of the population. It is appropriate to use the binomial distribution as the underlying model.
n = 2400
Use Excel's Histogram tool to organize the data. The output is shown below. The total was calculated with an Excel formula. In the data set, 0 = not interested, 1 = not sure, 2 = interested, 3 = very interested.
Bin    Frequency
0      1197
1      635
2      377
3      191
Total  2400
The completed Excel template for this data set is shown below.
Making Decisions About the Population Proportion with a Single Sample
Sample Size n: 2400
Hypothetical Value of Population Proportion p (decimal form): 0.25
np: 600
nq: 1800
Are both np and nq >= 10? yes
Sample Proportion: 0.23667
z-Score: -1.5085
One-Tailed p-Value: 0.06571
Two-Tailed p-Value: 0.13143

np = 600
nq = 1800
Both are ≥ 10, so the sampling distribution of p̂ will be approximately normal.
p̂ = (377 + 191)/2400 = 0.23667
This is a one-tailed test, so the p-value = 0.06571 > α = 0.05. Fail to reject H0. There is not enough evidence to infer that fewer than one quarter of all Canadians are interested or very interested in having smart meters installed in their homes.
18. H0: µ = $25
H1: µ > $25
α = 0.05
A histogram of the data set is shown below. The sample data appear to be approximately normally distributed. In this case, with the sample size fairly large at 50, we rely to an extent on the robustness of the t-distribution.

[Histogram: Survey of Drugstore Customers, Most Recent Purchase. Vertical axis: Number of Customers; horizontal axis: Amount of Purchase.]

A completed template for the problem is shown below.

Making Decisions About the Population Mean with a Single Sample
Do the sample data appear to be normally distributed? yes
Sample Standard Deviation s: 7.5543
Sample Mean: 26.4396
Sample Size n: 50
Hypothetical Value of Population Mean: 25
t-Score: 1.34751
One-Tailed p-Value: 0.09201
Two-Tailed p-Value: 0.18401
This is a one-tailed test. The p-value is 0.092, which is greater than the level of significance (5%). We fail to reject the null hypothesis. There is not enough evidence to infer that the average purchase amount at the drugstore is more than $25.
19. First, the data must be analyzed. Using Excel's Histogram tool from Data Analysis, we discover the following.

Survey of Drugstore Customers, Ratings of Staff Friendliness
Rating     Number of Customers   Percentage of Customers
Excellent  15                    30%
Good       23                    46%
Fair       10                    20%
Poor       2                     4%
Total      50

H0: p = 0.05
H1: p < 0.05
α = 0.04
Sampling is done without replacement. The sample size of 50 is probably not more than 5% of the total customer base of the drugstore.
np = 50(0.05) = 2.5
nq = 50(1 – 0.05) = 47.5
Since np is not ≥ 10, the sampling distribution of p̂ will not be approximately normal. Instead, we use Excel and the binomial distribution. In the sample, 2 customers rated staff friendliness as poor.
P(x ≤ 2, n = 50, p = 0.05) = 0.540533
p-value = 0.5405 > α = 0.04
We fail to reject H0. There is insufficient evidence to infer that fewer than 5% of the customers rate staff friendliness as poor.
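Since the normal approximation fails here (np = 2.5 < 10), the solution falls back on the exact binomial probability. The cumulative probability that Excel reports can be reproduced directly from the binomial formula; a minimal sketch:

```python
from math import comb

# Exercise 19: exact binomial p-value, P(X <= 2) with n = 50, p = 0.05,
# matching the cumulative binomial value computed in Excel.
n, p = 50, 0.05
p_value = sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(3))
print(round(p_value, 6))  # approximately 0.540533
```

Because the p-value comes from summing exact binomial terms, no normality condition is needed at all; the cost is that this approach only works when the null proportion and counts are available directly, as they are here.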
20. H0: µ = $45,000
H1: µ > $45,000
α = 0.04
First we examine the data. A histogram of the data set is shown below.

[Histogram: Survey of Drugstore Customers, Annual Income. Vertical axis: Number of Customers; horizontal axis: Annual Income.]
This data set is very skewed to the right, and is probably too skewed to allow the use of the t-distribution. We cannot proceed with this analysis with the tools currently at our disposal.
21. H0: µ = 40
H1: µ ≠ 40
α = 0.05
First we examine the data. A histogram of the data set is shown below.

[Histogram: Survey of Drugstore Customers, Customer Ages. Vertical axis: Number of Customers; horizontal axis: Age.]
This histogram is extremely skewed to the right, and is too skewed for us to proceed. We cannot proceed with the analysis with the tools currently at our disposal.
22. First, the data must be analyzed. Using Excel's Histogram tool from Data Analysis, we discover the following.

Survey of Drugstore Customers, Ratings of Speed of Service
Rating     Number of Customers   Percentage of Customers
Excellent  3                     6%
Good       19                    38%
Fair       19                    38%
Poor       9                     18%
Total      50

H0: p = 0.40
H1: p > 0.40
α = 0.03
Sampling is done without replacement. The sample size of 50 is probably not more than 5% of the total customer base of the drugstore. The completed Excel template is shown below.

Making Decisions About the Population Proportion with a Single Sample
Sample Size n: 50
Hypothetical Value of Population Proportion p (decimal form): 0.4
np: 20
nq: 30
Are both np and nq >= 10? yes
Sample Proportion: 0.44
z-Score: 0.57735
One-Tailed p-Value: 0.28185
Two-Tailed p-Value: 0.5637

np = 50(0.40) = 20
nq = 50(1 – 0.40) = 30
Both are ≥ 10, so the sampling distribution of p̂ will be approximately normal. From the sample, we see that 44% of customers rated the speed of service as good or excellent.
The p-value is 0.282, which is > α = 0.03. We fail to reject H0. There is insufficient evidence to infer that more than 40% of customers rate the speed of service as good or excellent.

23. As usual, we must examine the data before we proceed. A histogram is shown below.
[Histogram: Random Sample of City Households. Vertical axis: Number of Families; horizontal axis: After-Tax Incomes for Families of Two or More People.]
The histogram is skewed to the right. (This is often the case with income data.) The sample size is quite large, at 550. However, before we proceed with the analysis, it might be useful to think about whether there are actually two populations of data here. It could be, for instance, that there are a few high incomes in one exclusive area. Depending on the goal of the analysis, it might be useful to remove these incomes from the data set. However, this should only be done for good reason, and in a logical way. We will not proceed with the current data set, but this is a judgment call. The sample size may be large enough that the sampling distribution will be normal enough for reliable results.
24. a. H0: p = 0.25
H1: p > 0.25
α = 0.04
Sampling is done without replacement. We are considering these students as a random sample of all students entering Business programs in Canadian colleges, so the sample is ≤ 5% of the population. First, organize the data. "0" means "no", and "1" means "yes".

Bin    Frequency
0      371
1      139
Total  510

The completed Excel template is shown below.

Making Decisions About the Population Proportion with a Single Sample
Sample Size n: 510
Hypothetical Value of Population Proportion p (decimal form): 0.25
np: 127.5
nq: 382.5
Are both np and nq >= 10? yes
Sample Proportion: 0.27255
z-Score: 1.17601
One-Tailed p-Value: 0.11979
Two-Tailed p-Value: 0.23959
We see that np and nq > 10, so conditions for normality of the sampling distribution are met. The one-tailed p-value is 0.12 > α = 0.04. There is not enough evidence to infer that over 25% of incoming students have laptops.

b. First, analyze the data.
Where Do You Live?             Frequency   Relative Frequency
With Parents or Relatives      164         32.16%
In Residence                   130         25.49%
Rental House or Apartment      113         22.16%
Own Home                       27          5.29%
Rent Room                      76          14.90%
Total                          510

The completed Excel template is shown below.

Making Decisions About the Population Proportion with a Single Sample
Sample Size n: 510
Hypothetical Value of Population Proportion p (decimal form): 0.1
np: 51
nq: 459
Are both np and nq >= 10? yes
Sample Proportion: 0.14902
z-Score: 3.69006
One-Tailed p-Value: 0.00011
Two-Tailed p-Value: 0.00022
H0: p = 0.10
H1: p ≠ 0.10
α = 0.04
Sampling is done without replacement. We are considering these students as a random sample of all students entering Business programs in Canadian colleges, so the sample is ≤ 5% of the population. We see from the template that np and nq > 10. The two-tailed p-value is 0.00022 < α = 0.04. Reject H0. There is sufficient evidence to suggest that the percentage of incoming students who rent rooms is not 10%.
c. First, analyze the data and create a histogram to check for normality.

[Histogram: Marks of Incoming Students. Vertical axis: Number of Students; horizontal axis: Mark.]

The histogram is fairly normal. The completed Excel template for the hypothesis test is shown below.

Making Decisions About the Population Mean with a Single Sample
Do the sample data appear to be normally distributed? yes
Sample Standard Deviation s: 7.69522
Sample Mean: 85.1922
Sample Size n: 510
Hypothetical Value of Population Mean: 80
t-Score: 15.2374
One-Tailed p-Value: 9.2E-44
Two-Tailed p-Value: 1.8E-43
H0: µ = 80 H1: µ > 80 α = 0.04 From the template, we see that the p-value is very small. Reject H0. There is very convincing evidence that the average mark of incoming students is over 80%. Copyright © 2011 Pearson Canada Inc.
d.
First, analyze the data and create a histogram.
[Histogram: Incoming Students, Amount of Savings Available for Education — number of students vs. amount of savings available]
The histogram is fairly normal, and the sample size is quite large. The completed Excel template is shown below.

Making Decisions About the Population Mean with a Single Sample
Do the sample data appear to be normally distributed?   yes
Sample Standard Deviation s                             3227.12
Sample Mean                                             6427.25
Sample Size n                                           510
Hypothetical Value of Population Mean                   6500
t-Score                                                 -0.5091
One-Tailed p-Value                                      0.30546
Two-Tailed p-Value                                      0.61093
H0: µ = $6,500
H1: µ < $6,500
α = 0.04
From the template, we see that the one-tailed p-value is 0.30546. Fail to reject H0. There is not enough evidence to infer that the average amount of savings available for education is less than $6,500.
Instructor’s Solutions Manual - Chapter 8
Chapter 8 Solutions

Develop Your Skills 8.1

1. First, check conditions. Sampling is done without replacement. We have no way to know if the sample of 300 construction workers is ≤ 5% of the total number of construction workers. We proceed by noting this, and that the estimate will not be correct if there are not at least 6,000 workers in total.
np̂ = 56, nq̂ = 244. Both are ≥ 10, so we can proceed.
p̂ ± (critical z-score) √(p̂q̂/n)
= 56/300 ± 2.576 √((56/300)(244/300)/300)
= 0.1866667 ± 2.576(0.02249609)
= (0.1287, 0.2446)

A 99% confidence interval estimate for the proportion of smokers is (0.1287, 0.2446). The Excel template confirms this, with slightly more accurate results.
Confidence Interval Estimate for the Population Proportion
Confidence Level (decimal form)     0.99
Sample Proportion                   0.18667
Sample Size n                       300
np-hat                              56
nq-hat                              244
Are np-hat and nq-hat >= 10?        yes
Upper Confidence Limit              0.24461
Lower Confidence Limit              0.12872
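The same interval can be sketched in Python with the standard library's NormalDist, which supplies a slightly more precise critical z than the table value 2.576:

```python
from math import sqrt
from statistics import NormalDist

n = 300
successes = 56          # smokers in the sample
p_hat = successes / n
q_hat = 1 - p_hat

conf = 0.99
z = NormalDist().inv_cdf(1 - (1 - conf) / 2)   # two-sided critical z

se = sqrt(p_hat * q_hat / n)
lower = p_hat - z * se
upper = p_hat + z * se
print(round(lower, 5), round(upper, 5))
```

With the table value 2.576 in place of inv_cdf's 2.5758, the hand calculation above gives the same interval to four decimal places.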
2.
First, check conditions. Sampling is done without replacement. We have no way to know if the sample of 200 employees is ≤ 5% of the total number of staff. We proceed by noting this, and that the estimate will not be correct if there are not at least 4,000 workers in total.
np̂ = 142, nq̂ = 58. Both are ≥ 10, so we can proceed.

p̂ ± (critical z-score) √(p̂q̂/n)
= 142/200 ± 1.96 √((142/200)(58/200)/200)
= 0.71 ± 1.96(0.032085822)
= (0.6471, 0.7729)

A 95% confidence interval estimate for the proportion of employees who have children of daycare age is (0.6471, 0.7729). The Excel template confirms this, with slightly more accurate results.
Confidence Interval Estimate for the Population Proportion
Confidence Level (decimal form)     0.95
Sample Proportion                   0.71
Sample Size n                       200
np-hat                              142
nq-hat                              58
Are np-hat and nq-hat >= 10?        yes
Upper Confidence Limit              0.77289
Lower Confidence Limit              0.64711
3.
First, check conditions. Sampling is done without replacement. However, the sample size of 1000 is certainly no more than 5% of all Canadians.
np̂ = 1000(0.48) = 480, nq̂ = 1000(1 − 0.48) = 520. Both are ≥ 10, so we can proceed.

p̂ ± (critical z-score) √(p̂q̂/n)
= 0.48 ± 1.645 √((0.48)(0.52)/1000)
= 0.48 ± 1.645(0.015798734)
= (0.4540, 0.5060)
A 90% confidence interval estimate for the proportion of all Canadians who do not feel knowledgeable about such television features is (0.4540, 0.5060). The Excel template confirms this, with slightly more accurate results.
Confidence Interval Estimate for the Population Proportion
Confidence Level (decimal form)     0.90
Sample Proportion                   0.48
Sample Size n                       1000
np-hat                              480
nq-hat                              520
Are np-hat and nq-hat >= 10?        yes
Upper Confidence Limit              0.505986605
Lower Confidence Limit              0.454013395
4.
First, check conditions. Sampling is done without replacement. The east end of St. John's, Newfoundland, would contain thousands of households, so the sample of 50 is no more than 5% of the population.
np̂ = 37, nq̂ = 13. Both are ≥ 10, so we can proceed.

p̂ ± (critical z-score) √(p̂q̂/n)
= 0.74 ± 2.326 √((0.74)(0.26)/50)
= 0.74 ± 2.326(0.062032)
= (0.5957, 0.8843)
A 98% confidence interval estimate for the proportion of households in the east end of St. John's with high-speed internet access is (0.5957, 0.8843). The Excel template confirms this, with slightly more accurate results.
Confidence Interval Estimate for the Population Proportion
Confidence Level (decimal form)     0.98
Sample Proportion                   0.74
Sample Size n                       50
np-hat                              37
nq-hat                              13
Are np-hat and nq-hat >= 10?        yes
Upper Confidence Limit              0.884308592
Lower Confidence Limit              0.595691408
5.
First, record the information given.
n = 1,202
p̂ = 469/1202 = 0.390183
confidence level = 19/20 = 0.95
The result is accurate to within 2.9 percentage points. This means the half-width of the confidence interval is 0.029. The 95% confidence interval will be
(0.390183 − 0.029, 0.390183 + 0.029) = (0.361, 0.419)
The polling company has 95% confidence that the interval (36.1%, 41.9%) contains the percentage of British Columbia residents who think that retailers should provide biodegradable plastic bags to consumers at no charge.
Develop Your Skills 8.2

6. We are told the sample data appear approximately normal, so we will assume the population data are normal.
n = 25, x̄ = $112.36, s = $32.45
For 95% confidence, we need to identify t.025 for 24 degrees of freedom. From the tables, t.025 = 2.064.

x̄ ± (critical t-score)(s/√n)
= 112.36 ± (2.064)(32.45/√25)
= 112.36 ± 13.39536
= (98.96, 125.76)

A 95% confidence interval estimate for the average grocery bill of all households who shop at this store is ($98.96, $125.76).
The Excel template confirms this.
Confidence Interval Estimate for the Population Mean
Do the sample data appear to be normally distributed?   yes
Confidence Level (decimal form)                         0.95
Sample Mean                                             112.36
Sample Standard Deviation s                             32.45
Sample Size n                                           25
Upper Confidence Limit                                  125.755
Lower Confidence Limit                                  98.9653
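A sketch of the same calculation in Python. The critical value t.025 = 2.064 is hard-coded from the t-table, because the standard library has no t-distribution (scipy.stats.t.ppf would give a more precise value):

```python
from math import sqrt

# Summary statistics from the solution above
n = 25
x_bar = 112.36
s = 32.45

# t.025 for n - 1 = 24 degrees of freedom, read from the t-table
t_crit = 2.064

half_width = t_crit * s / sqrt(n)
lower = x_bar - half_width
upper = x_bar + half_width
print(round(lower, 2), round(upper, 2))
```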
7.
Because the sample data set is clearly non-normal and highly skewed to the right, the necessary conditions are not met, and we cannot construct a confidence interval.
8.
We are told to assume the daycare costs are normally distributed.
n = 50, x̄ = $460, s = $65
For 98% confidence, we need to identify t.010 for 49 degrees of freedom. t.010 = 2.403 for 50 degrees of freedom, the closest we could get from the tables.

x̄ ± (critical t-score)(s/√n)
= 460 ± (2.403)(65/√50)
= 460 ± 22.08930874
= (437.91, 482.09)

A 98% confidence interval estimate for the average monthly daycare costs for Halifax households is ($437.91, $482.09).
The Excel template confirms this. The numbers are slightly different because the Excel template is more accurate.
Confidence Interval Estimate for the Population Mean
Do the sample data appear to be normally distributed?   yes
Confidence Level (decimal form)                         0.98
Sample Mean                                             460
Sample Standard Deviation s                             65
Sample Size n                                           50
Upper Confidence Limit                                  482.107
Lower Confidence Limit                                  437.893

9.
First, check the sample data for normality. One possible histogram is shown below.
[Histogram: Random Sample of Marks on a Statistics Test — number of marks vs. mark (%)]
Since the histogram is fairly normal, we proceed.
A completed Excel template is shown below.
Confidence Interval Estimate for the Population Mean
Do the sample data appear to be normally distributed?   yes
Confidence Level (decimal form)                         0.95
Sample Mean                                             59.8
Sample Standard Deviation s                             15.5381
Sample Size n                                           20
Upper Confidence Limit                                  67.072
Lower Confidence Limit                                  52.528
A 95% confidence interval estimate for the average grade on the statistics test is (52.5, 67.1).

10. We are told to assume the salary data are approximately normally distributed.
n = 50, x̄ = $56,387, s = $5,435
For 90% confidence, we need to identify t.050 for 49 degrees of freedom. t.050 = 1.676 for 50 degrees of freedom, the closest we could get from the tables.

x̄ ± (critical t-score)(s/√n)
= 56387 ± (1.676)(5435/√50)
= 56387 ± 1288.215619
= (55098.78, 57675.22)

A 90% confidence interval estimate for the average salary of university graduates with a bachelor's degree in science, and 10 years of working experience, in the Toronto area is ($55,098.78, $57,675.22).
The Excel template confirms this. The numbers are slightly different because the Excel template is more accurate.
Confidence Interval Estimate for the Population Mean
Do the sample data appear to be normally distributed?   yes
Confidence Level (decimal form)                         0.90
Sample Mean                                             56387
Sample Standard Deviation s                             5435
Sample Size n                                           50
Upper Confidence Limit                                  57675.64
Lower Confidence Limit                                  55098.36
Develop Your Skills 8.3

11. The confidence level is 99%, so the z-score = 2.576.
HW = 0.05
p̂ = 0.1867
Substitute these values into the formula:

n = p̂q̂ (z-score/HW)²
= (0.1867)(1 − 0.1867) (2.576/0.05)²
= (0.15184311)(2654.31)
= 403.03

A sample size of 404 is required to estimate the proportion of smokers on staff to within 5%, with 99% confidence.
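A sketch of the sample-size calculation in Python, using the table value z = 2.576 as in the hand calculation (tiny differences in the raw value come from rounding; sample sizes are always rounded up):

```python
from math import ceil

z = 2.576        # table value for 99% confidence
hw = 0.05        # desired half-width
p_hat = 0.1867   # estimate from the earlier sample of construction workers

# n = p-hat * q-hat * (z / HW)^2, rounded up to a whole person
n_raw = p_hat * (1 - p_hat) * (z / hw) ** 2
n = ceil(n_raw)
print(n)
```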
12. Since no estimate of the proportion is available, we will use p̂ = 0.50.
For 95% confidence, the z-score is 1.96. HW = 0.05.

n = p̂q̂ (z-score/HW)²
= (0.5)(0.5) (1.96/0.05)²
= (0.25)(1536.64)
= 384.16

A sample size of 385 is necessary to estimate the proportion of employees who have children of daycare age to within 5%, with 95% confidence.

For 98% confidence, the z-score is 2.326.

n = p̂q̂ (z-score/HW)²
= (0.5)(0.5) (2.326/0.05)²
= (0.25)(2164.1104)
= 541.03

The sample size would have to increase to 542 to estimate the proportion of employees who have children of daycare age to within 5%, with 98% confidence.

13. For 95% confidence, the z-score is 1.96.
HW = $10
s = $32.45

n = ((z-score)(s)/HW)²
= ((1.96)(32.45)/10)²
= 40.5

A sample size of 41 is necessary to estimate the average grocery bill of households who shop at this supermarket to within $10, with 95% confidence.
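The sample-size formula for a mean can be sketched the same way (a minimal version using the values from exercise 13):

```python
from math import ceil

z = 1.96      # critical z for 95% confidence
hw = 10.0     # desired half-width, in dollars
s = 32.45     # sample standard deviation from the pilot sample

# n = ((z * s) / HW)^2, rounded up to a whole household
n_raw = (z * s / hw) ** 2
n = ceil(n_raw)
print(round(n_raw, 1), n)
```

When no pilot estimate of s exists, the solutions later substitute a rough estimate such as range/4 (see exercise 8 of the Chapter Review).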
14. For 95% confidence, the z-score is 1.96.
HW = 5
s = 15.54

n = ((z-score)(s)/HW)²
= ((1.96)(15.54)/5)²
= 37.1

A sample size of 38 is necessary to estimate the average mark on the stats test to within 5 marks, with 95% confidence.
15. We will use an estimate of p̂ of 0.5, since no other information is given.
For 96% confidence, the z-score is 2.05. HW = 0.04.

n = p̂q̂ (z-score/HW)²
= (0.5)(1 − 0.5) (2.05/0.04)²
= (0.25)(2626.5625)
= 656.64

A sample size of 657 would have to be taken to estimate the percentage of Canadian internet users who visit social networking sites, to within 4%, with 96% confidence.

Develop Your Skills 8.4

16. A 95% confidence interval estimate for the average grocery bill of all households who shop at this store is ($98.96, $125.76). Since this does not contain $95, we can reject the owner's claim that the average household grocery bill is $95, with a 5% level of significance.
17. A 99% confidence interval estimate for the proportion of smokers is (0.1287, 0.2446). Since this contains 20%, we do not have sufficient evidence to reject the nurse's claim that 20% of the staff are smokers. A 1% level of significance applies.

18. A 99% confidence interval estimate for the average weekly spending of members of the college community on morning coffees is ($3.24, $5.76). Since this does not contain $6, there is sufficient evidence to reject the manager's claim that the average weekly spending of members of the college community on morning coffees is $6, with a 1% level of significance.
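The reasoning used throughout Section 8.4 — reject a two-sided claim at significance α exactly when the hypothesized value falls outside the (1 − α) confidence interval — can be expressed as a tiny helper. A sketch, with the interval endpoints taken from exercises 16 and 17 above:

```python
def rejects_claim(ci_lower, ci_upper, claimed_value):
    """Two-sided test via a confidence interval: reject the claim
    when the hypothesized value lies outside the interval."""
    return not (ci_lower <= claimed_value <= ci_upper)

# Exercise 16: 95% CI ($98.96, $125.76) vs. the owner's claim of $95
reject_16 = rejects_claim(98.96, 125.76, 95)
# Exercise 17: 99% CI (0.1287, 0.2446) vs. the nurse's claim of 20%
reject_17 = rejects_claim(0.1287, 0.2446, 0.20)
print(reject_16, reject_17)
```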
19. A 95% confidence interval estimate for the average grade on the statistics test is (52.5, 67.1). Since this interval does not contain 50, there is sufficient evidence to reject your belief that the average mark on the test was 50, with a 5% level of significance.

Chapter Review Exercises

The solutions are shown as calculations done by hand with tables and a calculator, when summary data are provided. The use of Excel tools and templates is shown when an Excel data set is provided.
1.
n = 400, p̂ = 0.26
First, check conditions. Sampling is done without replacement. The population is all Canadian grocery shoppers, so the sample of 400 is definitely ≤ 5% of the population.
np̂ = 400(0.26) = 104, nq̂ = 400(0.74) = 296. Both are ≥ 10, so we can proceed.
a.
p̂ ± (critical z-score) √(p̂q̂/n)
= 0.26 ± 1.645 √((0.26)(0.74)/400)
= 0.26 ± 1.645(0.021932)
= (0.2239, 0.2961)

A 90% confidence interval estimate for the proportion of all Canadian grocery shoppers who are trying to make healthier choices is (0.2239, 0.2961).

b.

p̂ ± (critical z-score) √(p̂q̂/n)
= 0.26 ± 2.05(0.021932)
= (0.2150, 0.3050)

A 96% confidence interval estimate for the proportion of all Canadian grocery shoppers who are trying to make healthier choices is (0.2150, 0.3050).
c.

p̂ ± (critical z-score) √(p̂q̂/n)
= 0.26 ± 2.576(0.021932)
= (0.2035, 0.3165)

A 99% confidence interval estimate for the proportion of all Canadian grocery shoppers who are trying to make healthier choices is (0.2035, 0.3165).

d.
As the desired level of confidence increases, the confidence interval gets wider.
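The pattern in parts (a)–(d) can be checked directly: holding p̂ and n fixed, the interval width grows with the confidence level. A sketch:

```python
from math import sqrt
from statistics import NormalDist

n = 400
p_hat = 0.26
se = sqrt(p_hat * (1 - p_hat) / n)   # 0.021932, as in the hand calculations

# Interval width (2 * z * se) at each confidence level from parts (a)-(c)
widths = []
for conf in (0.90, 0.96, 0.99):
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    widths.append(2 * z * se)

print([round(w, 4) for w in widths])
```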
2.
The confidence level is 95%, so the z-score = 1.96. HW = 0.03, p̂ = 0.26.
Substitute these values into the formula:

n = p̂q̂ (z-score/HW)²
= (0.26)(0.74) (1.96/0.03)²
= 821.2

A sample size of 822 is required to estimate the proportion of Canadian grocery shoppers who are trying to make healthier food choices, to within 3%, with 95% confidence.

If there was no sample information available about the proportion of Canadian grocery shoppers, we would have to use p̂ = 0.5.

n = p̂q̂ (z-score/HW)²
= (0.5)(0.5) (1.96/0.03)²
= 1067.1

The required sample size would be 1068, which is larger, and would be more expensive than a sample of 822. Having some estimate of the proportion will likely reduce the costs of sampling.
3.
There is no information about whether the sample data are normally distributed. We will proceed by assuming that they are, with the caution that our results are not reliable if in fact they are not.
n = 212, x̄ = 1576
a.
s = 521
For 95% confidence, we need to identify t.025 for 211 degrees of freedom. t.025 = 1.972 for 200 degrees of freedom, the closest we could get from the tables.

x̄ ± (critical t-score)(s/√n)
= 1576 ± (1.972)(521/√212)
= 1576 ± 1.972(35.7824)
= (1505, 1647)

A 95% confidence interval estimate for the average number of web pages visited per month by Canadian internet users is (1505, 1647).
b.
s = 321
For 95% confidence, we need to identify t.025 for 211 degrees of freedom. t.025 = 1.972 for 200 degrees of freedom, the closest we could get from the tables.

x̄ ± (critical t-score)(s/√n)
= 1576 ± (1.972)(321/√212)
= 1576 ± 1.972(22.0464)
= (1533, 1619)

A 95% confidence interval estimate for the average number of web pages visited per month by Canadian internet users is (1533, 1619).
c.
s = 201
For 95% confidence, we need to identify t.025 for 211 degrees of freedom. t.025 = 1.972 for 200 degrees of freedom, the closest we could get from the tables.

x̄ ± (critical t-score)(s/√n)
= 1576 ± (1.972)(201/√212)
= 1576 ± 1.972(13.8047)
= (1549, 1603)

A 95% confidence interval estimate for the average number of web pages visited per month by Canadian internet users is (1549, 1603).
d. As the variability in the data decreases, the confidence intervals become narrower. They do not have to be so wide, because the distributions are not so wide.

4. a. For 95% confidence, the z-score is 1.96.
HW = 100
s = 521

n = ((z-score)(s)/HW)²
= ((1.96)(521)/100)²
= 104.3

A sample size of 105 is necessary to estimate the average number of web pages visited per month by Canadian internet users to within 100 pages, with 95% confidence.
b. For 95% confidence, the z-score is 1.96.
HW = 50
s = 521

n = ((z-score)(s)/HW)²
= ((1.96)(521)/50)²
= 417.1

A sample size of 418 is necessary to estimate the average number of web pages visited per month by Canadian internet users to within 50 pages, with 95% confidence.
c. For 95% confidence, the z-score is 1.96.
HW = 10
s = 521

n = ((z-score)(s)/HW)²
= ((1.96)(521)/10)²
= 10427.7

A sample size of 10,428 is necessary to estimate the average number of web pages visited per month by Canadian internet users to within 10 pages, with 95% confidence.
d.
As the desired level of accuracy increases, the sample size required increases.
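The trade-off in parts (a)–(c) can be tabulated in a few lines (a sketch using the z-based sample-size formula from the solutions):

```python
from math import ceil

z = 1.96   # critical z for 95% confidence
s = 521    # sample standard deviation of pages visited

# Required sample size for each desired half-width (in pages)
sizes = {hw: ceil((z * s / hw) ** 2) for hw in (100, 50, 10)}
print(sizes)
```

Halving the half-width roughly quadruples the required sample size, because n grows with (1/HW)².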
5.
We have no estimate for p̂, so we will use p̂ = 0.5.
The confidence level is 90%, so the z-score = 1.645. HW = 0.02.
Substitute these values into the formula:

n = p̂q̂ (z-score/HW)²
= (0.5)(1 − 0.5) (1.645/0.02)²
= (0.25)(6765.0625)
= 1691.3

A sample size of 1692 is required to estimate the proportion of new graduates of a Business program who are willing to relocate to find a job, to within 2%, with 90% confidence. Since your college graduates only about 350 students from the Business program, this indicates that the entire population should be surveyed for the desired level of accuracy.
6.
Sampling is done without replacement, but there are millions of adults in the greater Toronto area, so we can be sure that the sample of 316 is ≤ 5% of the total population.
np̂ = 316(0.40) = 126.4, nq̂ = 316(1 − 0.40) = 189.6. Both are ≥ 10, so we can proceed.

p̂ ± (critical z-score) √(p̂q̂/n)
= 0.40 ± 1.645 √((0.4)(0.6)/316)
= 0.40 ± 1.645(0.027558913)
= (0.3547, 0.4453)

A 90% confidence interval estimate for the proportion of adults in the greater Toronto area who would keep their jobs if they won $10-million in the lottery is (0.3547, 0.4453).

7.
We are told the sample data appear normally distributed, so we will assume the population data are as well.
n = 30, x̄ = 54.2, s = 3.2
For 99% confidence, we need to identify t.005 for 29 degrees of freedom. t.005 = 2.756.

x̄ ± (critical t-score)(s/√n)
= 54.2 ± (2.756)(3.2/√30)
= 54.2 ± 1.610158
= (52.59, 55.81)

A 99% confidence interval estimate for average hours of work per week for these employees is (52.59, 55.81) hours.
8.
For 95% confidence, the z-score is 1.96. HW = 1.
There is no estimate of s. However, the range is (10 − 1) = 9, so we will use 9/4 as an estimate of s.
s ≈ 9/4 = 2.25

n = ((z-score)(s)/HW)²
= ((1.96)(2.25)/1)²
= 19.4

A sample size of 20 is necessary to estimate the average number of hours, per week, that students in statistics classes spend (outside class) working on statistics, to within 1 hour, with 95% confidence.
The results of any such survey should be interpreted with care, because it would be impossible to get the data independent of the students themselves. It is not unlikely that students would inflate their estimates of the amount of work they do, particularly if they wanted to impress their professor favourably.

9.
The confidence level is 95%, so the z-score = 1.96. HW = 0.02, p̂ = 0.10.
Substitute these values into the formula:

n = p̂q̂ (z-score/HW)²
= (0.1)(1 − 0.1) (1.96/0.02)²
= (0.09)(9604)
= 864.36

A sample size of 865 is required to estimate the percentage of the adult population in Canada who would consider buying a hybrid vehicle for their next purchase, to within 2%, with 95% confidence.
10. The confidence level is 95%, so the z-score = 1.96. HW = 0.03.
No estimate of p̂ is given, so we will use p̂ = 0.5.
Substitute these values into the formula:

n = p̂q̂ (z-score/HW)²
= (0.5)(1 − 0.5) (1.96/0.03)²
= (0.25)(4268.444444)
= 1067.11

A sample size of 1068 is required to estimate the percentage of households that make consistent efforts to separate recyclable materials from their garbage, to within 3 percentage points, with 95% confidence.

11. Sampling is done without replacement. The sample size is 2450, which is quite large. If the total population is all workers, though, there would be millions, and so the sample would be ≤ 5% of the population.
np̂ = 2450(0.43) = 1053.5, nq̂ = 2450(1 − 0.43) = 1396.5. Both are ≥ 10, so we can proceed.
For a 99% confidence level, the z-score is 2.576.

p̂ ± (critical z-score) √(p̂q̂/n)
= 0.43 ± 2.576 √((0.43)(0.57)/2450)
= 0.43 ± 2.576(0.010002041)
= (0.4042, 0.4558)

A 99% confidence interval estimate for the proportion of workers who phone in sick when they are not ill is (0.4042, 0.4558). It is hard to assess the reliability of these results. Would people tell the truth when they were asked such a question? There are a number of reasons why they might not.
12. We are told the sample data appear approximately normal, so we will assume the population data are normal.
n = 40, x̄ = $68.52, s = $14.89
For 98% confidence, we need to identify t.010 for 39 degrees of freedom. There is no row in the t-table for 39 degrees of freedom. We will use the row for 40 degrees of freedom, and we see that t.010 = 2.423.

x̄ ± (critical t-score)(s/√n)
= 68.52 ± (2.423)(14.89/√40)
= 68.52 ± 5.704506985
= ($62.82, $74.22)

A 98% confidence interval estimate of the average amount spent (per person) in this restaurant by diners with business expense accounts is ($62.82, $74.22).
13. We are told the sample data appear approximately normal, so we will assume the population data are normal.
n = 40, x̄ = $543.21, s = $47.89
For 95% confidence, we need to identify t.025 for 39 degrees of freedom. Since there is no row in the table for 39 degrees of freedom, we will use the row for 40 degrees of freedom. We see that t.025 = 2.021.

x̄ ± (critical t-score)(s/√n)
= 543.21 ± (2.021)(47.89/√40)
= 543.21 ± 15.30316127
= ($527.91, $558.51)

A 95% confidence interval estimate for the average monthly rent for students at this college is ($527.91, $558.51). Because this interval does not contain $500, there is sufficient evidence to reject the claim that the average monthly rent is $500, with 5% significance.
14. The confidence level is 98%, so the z-score = 2.326. HW = 0.03, p̂ = 0.35.
Substitute these values into the formula:

n = p̂q̂ (z-score/HW)²
= (0.35)(1 − 0.35) (2.326/0.03)²
= (0.2275)(6011.417778)
= 1367.6

A sample size of 1368 is required to estimate the proportion of this college's students who live at home with their parents, to within 3%, with 98% confidence.

15. First we have to organize the data. We can use Excel's Histogram tool to organize the data, and then produce the following table.
Customer Survey for an Ice Cream Store
Which flavour would you like to try, if any?

Response                               Frequency   Relative Frequency
Pecan and Fudge                            37           24.7%
Apple Pie                                  16           10.7%
Banana Caramel Ripple                      29           19.3%
Ginger and Honey                           32           21.3%
Would Not Try Any Of These Flavours        36           24.0%
Total                                     150
From this we see p̂ = 0.24. Sampling is done without replacement. We do not know the total number of customers at the ice cream store. As long as the sample of 150 is not more than 5% of this total number, we can proceed.
Confidence Interval Estimate for the Population Proportion
Confidence Level (decimal form)     0.95
Sample Proportion                   0.1933
Sample Size n                       150
np-hat                              29
nq-hat                              121
Are np-hat and nq-hat >= 10?        yes
Upper Confidence Limit              0.256531268
Lower Confidence Limit              0.130135399
A 95% confidence interval estimate for the percentage of customers who would not try any of the new flavours is (0.1301, 0.2565).
16. A histogram of the sample data appears approximately normal.
[Histogram: Quantity of Toothpaste in a Random Sample of 30 Tubes — number of tubes vs. mL]
Confidence Interval Estimate for the Population Mean
Do the sample data appear to be normally distributed?   yes
Confidence Level (decimal form)                         0.95
Sample Mean                                             129.5
Sample Standard Deviation s                             2.3165
Sample Size n                                           30
Upper Confidence Limit                                  130.372
Lower Confidence Limit                                  128.642
A 95% confidence interval for the average amount of toothpaste in the tubes is (128.64, 130.37).
17. The data appear to be normally distributed.
[Histogram: No. of Customers Renting a Car in the 8 a.m.–10 a.m. Period at a Car Rental Agency — number of days vs. number of customers]
Confidence Interval Estimate for the Population Mean
Do the sample data appear to be normally distributed?   yes
Confidence Level (decimal form)                         0.99
Sample Mean                                             21.48
Sample Standard Deviation s                             3.56422
Sample Size n                                           50
Upper Confidence Limit                                  22.8308
Lower Confidence Limit                                  20.1292
A 99% confidence interval estimate for the average number of customers renting a car in the 8 a.m.-10 a.m. period at this car rental agency is (20.13, 22.83). Of course, it is not possible for there to be 20.13 customers. This number arises because the underlying data are not continuous (they are counts). However, it is not unusual to create a confidence interval estimate for such data sets. Realistically, the confidence interval would be (20, 23).
18. The sample data are approximately normally distributed, although somewhat skewed to the right. In this case, with a fairly large sample size of 60, we will rely on the robustness of the t-distribution and proceed.
[Histogram: Annual Car Maintenance Costs, Third Year of Life of Entry-level Compact — number of cars vs. annual maintenance costs]
A completed Excel template for the data set is shown below.
Confidence Interval Estimate for the Population Mean
Do the sample data appear to be normally distributed?   yes
Confidence Level (decimal form)                         0.98
Sample Mean                                             138.556
Sample Standard Deviation s                             69.8614
Sample Size n                                           60
Upper Confidence Limit                                  160.123
Lower Confidence Limit                                  116.990
A 98% confidence interval estimate for the annual maintenance costs of this entry-level compact in the 3rd year of its life is ($116.99, $160.12).
19. For 98% confidence, the z-score is 2.326.
HW = $10
s = 69.861423

n = ((z-score)(s)/HW)²
= ((2.326)(69.861423)/10)²
= 264.05

A sample size of 265 is necessary to estimate the annual maintenance costs of this entry-level compact in the 3rd year of its life to within $10, with 98% confidence.

20. First we must analyze the data. We can use Excel's Histogram tool to organize the data, and then create the table shown below.
Age of A/R                Frequency   Relative Frequency
0-30 Days Old                 47            0.47
31-60 Days Old                41            0.41
More Than 60 Days Old         12            0.12
Total                        100
From the table we see p̂ = 0.47. We are sampling without replacement. The company is described as "large". We must assume that the sample of 100 is not more than 5% of the population, and note that our analysis might not be correct otherwise.
np̂ = 47, nq̂ = 53. Both are ≥ 10, so we can proceed. A completed Excel template is shown below.
Confidence Interval Estimate for the Population Proportion
Confidence Level (decimal form)     0.95
Sample Proportion                   0.47
Sample Size n                       100
np-hat                              47
nq-hat                              53
Are np-hat and nq-hat >= 10?        yes
Upper Confidence Limit              0.567821643
Lower Confidence Limit              0.372178357
A 95% confidence interval estimate for the proportion of accounts receivable that are 0-30 days old is (0.3722, 0.5678).

21. The confidence level is 95%, so the z-score = 1.96. HW = 0.05, p̂ = 0.47.
Substitute these values into the formula:

n = p̂q̂ (z-score/HW)²
= (0.47)(1 − 0.47) (1.96/0.05)²
= (0.2491)(1536.64)
= 382.77

A sample size of 383 is required to estimate the proportion of accounts receivable that are 0-30 days old, to within 5%, with 95% confidence.
22. We have already examined this data set in Chapter 7 Review Exercise 24.

a.
Sampling is done without replacement. We are considering these students as a random sample of all students entering Business programs in Canadian colleges, so the sample is ≤ 5% of the population. First, organize the data. "0" means "no", and "1" means "yes".
Bin     Frequency
0         371
1         139
Total     510
The completed Excel template is shown below.
Confidence Interval Estimate for the Population Proportion
Confidence Level (decimal form)     0.96
Sample Proportion                   0.27255
Sample Size n                       510
np-hat                              139
nq-hat                              371
Are np-hat and nq-hat >= 10?        yes
Upper Confidence Limit              0.31304
Lower Confidence Limit              0.23206
We see that np and nq > 10, so conditions for normality of the sampling distribution are met. A 96% confidence interval estimate of the proportion of incoming students who have laptops is (0.23, 0.31).
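A sketch reproducing the template's limits from the raw counts (NormalDist supplies the 96% critical value z ≈ 2.054; small differences in the last decimal come from rounding):

```python
from math import sqrt
from statistics import NormalDist

# Counts from the frequency table above: 139 "yes" (laptop) out of 510
n = 510
p_hat = 139 / n
q_hat = 1 - p_hat

z = NormalDist().inv_cdf(0.98)   # 96% confidence -> alpha/2 = 0.02 in each tail
se = sqrt(p_hat * q_hat / n)
lower, upper = p_hat - z * se, p_hat + z * se
print(round(lower, 4), round(upper, 4))
```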
b.
The confidence level is 96%, so the z-score = 2.05. HW = 0.02, p̂ = 0.27255.
Substitute these values into the formula:

n = p̂q̂ (z-score/HW)²
= (0.27255)(1 − 0.27255) (2.05/0.02)²
= (0.198266)(10506.25)
= 2083.03

A sample size of 2084 is required to estimate the proportion of incoming students who have laptops to within 2%, with 96% confidence.

c.
First, analyze the data.
Where Do You Live?             Frequency   Relative Frequency
With Parents or Relatives         164           32.16%
In Residence                      130           25.49%
Rental House or Apartment         113           22.16%
Own Home                           27            5.29%
Rent Room                          76           14.90%
Total                             510
Sampling is done without replacement. We are considering these students as a random sample of all students entering Business programs in Canadian colleges, so the sample is ≤ 5% of the population.
The completed Excel template is shown below.
Confidence Interval Estimate for the Population Proportion
Confidence Level (decimal form)     0.96
Sample Proportion                   0.149020
Sample Size n                       510
np-hat                              76
nq-hat                              434
Are np-hat and nq-hat >= 10?        yes
Upper Confidence Limit              0.18140
Lower Confidence Limit              0.11663
We see from the template that np and nq > 10. A 96% confidence interval estimate of the proportion of incoming students who rent rooms is (0.1166, 0.1814).
d.
First, analyze the data and create a histogram to check for normality.
[Histogram: Marks of Incoming Students — number of students vs. mark]
The histogram is fairly normal. The completed Excel template for the confidence interval is shown below.
Confidence Interval Estimate for the Population Mean
Do the sample data appear to be normally distributed?   yes
Confidence Level (decimal form)                         0.96
Sample Mean                                             85.1922
Sample Standard Deviation s                             7.69522
Sample Size n                                           510
Upper Confidence Limit                                  85.8938
Lower Confidence Limit                                  84.4905
A 96% confidence interval estimate of the average mark of incoming students is (84.5, 85.9).
e.
First, analyze the data and create a histogram.
[Histogram: Incoming Students, Amount of Savings Available for Education — number of students vs. amount of savings available]
The histogram is fairly normal, and the sample size is quite large. The completed Excel template is shown below.
Confidence Interval Estimate for the Population Mean
Do the sample data appear to be normally distributed?   yes
Confidence Level (decimal form)                         0.96
Sample Mean                                             6427.25
Sample Standard Deviation s                             3227.12
Sample Size n                                           510
Upper Confidence Limit                                  6721.49
Lower Confidence Limit                                  6133.02
A 96% confidence interval estimate of the average amount of savings available for education among incoming students is ($6,133.02, $6,721.49).
f.
For 96% confidence, the z-score is 2.05. HW = $200, s = 3227.1172.

n = ((z-score)(s) / HW)² = ((2.05)(3227.1172) / 200)² = 1094.15

A sample size of 1,095 is necessary to estimate the amount of savings available for education to within $200, with 96% confidence.
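The sample-size formula above is a quick calculation to reproduce outside Excel; a minimal sketch in plain Python (not part of the textbook's solution):

```python
import math

z = 2.05            # z-value for 96% confidence, as in the solution
s = 3227.1172       # sample standard deviation of savings
hw = 200            # desired half-width of the interval, in dollars

# Required sample size: n = ((z * s) / HW)^2, then round up
n_required = (z * s / hw) ** 2
print(round(n_required, 2))   # ~ 1094.15
print(math.ceil(n_required))  # round up: 1095
```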
Instructor’s Solutions Manual - Chapter 9
Chapter 9 Solutions

Develop Your Skills 9.1

1. First, calculate differences.
Worker   Average Daily Production   Average Daily Production   Difference
         Before Music               After Music
1        18                         18                          0
2        14                         15                         -1
3        10                         12                         -2
4        11                         15                         -4
5         9                          7                          2
6        10                         11                         -1
7         9                          6                          3
8        11                         14                         -3
9        10                         11                         -1
10       12                         12                          0
Differences appear normally distributed, although as usual, this can be hard to determine with small sample sizes.
[Histogram: Production Before and After Music is Played in the Plant; x-axis: (Average Daily Production Before Music) - (Average Daily Production After Music), y-axis: Number of Workers]
H0: µD = 0
H1: µD < 0
(The order of subtraction is production before the music – production after the music is played. If music increases productivity, production before the music should be lower than production after, so the average difference would be negative.)
α = 0.04
Calculations can be done by hand, with Excel functions and the template, or with the Data Analysis tool. A completed template is shown below (mean, standard deviation, and sample size were computed with Excel functions).
Making Decisions About the Population Mean with a Single Sample
  Do the sample data appear to be normally distributed?   yes
  Sample Standard Deviation s              2.11082
  Sample Mean                              -0.7
  Sample Size n                            10
  Hypothetical Value of Population Mean    0
  t-Score                                  -1.0487
  One-Tailed p-Value                       0.16083
  Two-Tailed p-Value                       0.32166
This is a one-tailed test, so the p-value is 0.16. Since this is > 4%, we fail to reject H0. There is insufficient evidence to suggest that playing classical music led to increased worker productivity.
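The template's t-score can be reproduced from the ten differences in a few lines of plain Python (a cross-check outside the Excel workflow, not part of the textbook's solution):

```python
import math
import statistics

# Differences: (production before music) - (production after music)
diffs = [0, -1, -2, -4, 2, -1, 3, -3, -1, 0]

n = len(diffs)
mean_d = statistics.mean(diffs)   # -0.7
s_d = statistics.stdev(diffs)     # sample standard deviation, ~2.11082

# t-score against the hypothesized mean difference of 0
t = (mean_d - 0) / (s_d / math.sqrt(n))
print(round(t, 4))  # ~ -1.0487
```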
2.
First calculate the differences.

Weekly Sales Before and After Product Redesign
Store                  Sales After   Sales Before   Difference
51 Bayfield            $842.42       $813.67         $28.75
109 Mapleview Drive    $831.54       $698.71         $132.83
137 Wellington         $822.86       $734.48         $88.38
6 Collier              $876.97       $832.46         $44.51
421 Essa Road          $776.44       $791.22        -$14.78
19 Queen               $793.19       $766.73         $26.46
345 Cundles            $730.17       $668.66         $61.51
D-564 Byrne Drive      $576.95       $631.05        -$54.10
24 Archer              $758.87       $724.39         $34.48
15 Short St.           $736.04       $766.76        -$30.72
The differences appear to be normally distributed, as seen in the histogram below.
[Histogram: Sales for Gourmet Cookies, Before and After Packaging Redesign; x-axis: (Sales After Packaging Redesign) - (Sales Before Packaging Redesign), y-axis: Number of Stores]
H0: µD = 0
H1: µD > 0
(The order of subtraction is (sales after packaging redesign – sales before packaging redesign). If the package redesign leads to increased sales, sales should be higher after the redesign, so the differences would tend to be positive.)
α = 0.05
Calculations can be done by hand, with Excel functions and the template, or with the Data Analysis tool. Output from the Data Analysis t-test: paired two-sample for means is shown below.

t-Test: Paired Two Sample for Means
                                 Sales After        Sales Before
                                 Packaging Redesign Packaging Redesign
Mean                             774.545            742.813
Variance                         7085.906428        4098.8416
Observations                     10                 10
Pearson Correlation              0.749516219
Hypothesized Mean Difference     0
df                               9
t Stat                           1.800489248
P(T<=t) one-tail                 0.052654545
t Critical one-tail              1.833113856
P(T<=t) two-tail                 0.10530909
t Critical two-tail              2.262158887
This is a one-tailed test, so the appropriate p-value is 0.053. This is > 5%, so we fail to reject H0. There is insufficient evidence to suggest that sales of the cookies increased after the packaging was redesigned.

The completed Excel template for the confidence interval is shown below.

Confidence Interval Estimate for the Population Mean
  Do the sample data appear to be normally distributed?   yes
  Confidence Level (decimal form)    0.9
  Sample Mean                        $31.73
  Sample Standard Deviation s        55.7323
  Sample Size n                      10
  Upper Confidence Limit             64.039
  Lower Confidence Limit             -0.575
A 90% confidence interval estimate for the difference in sales after the product redesign is (-$0.58, $64.04).
3. We are not given the data set, but some summary data. We cannot check for normality of differences. We proceed by assuming the differences are normally distributed, noting that our conclusions may not be valid if this is not the case.

H0: µD = 0
H1: µD ≠ 0
(The order of subtraction is before – after. The alternative hypothesis concerns a difference in daily sales before and after the script change, either positive or negative.)
α = 0.05

We are given: x̄D = 4.2, sD = 23.4, nD = 56

t = (x̄D - µD) / (sD / √nD) = (4.2 - 0) / (23.4 / √56) = 1.343

We refer to the t-distribution with 55 degrees of freedom. There is no such row in the t-table, but whether we choose the row with 50 or 60 degrees of freedom, the calculated t-score of 1.343 is between t.100 and t.050. This is a two-tailed test, so:

2 • 0.050 < p-value < 2 • 0.100
0.1 < p-value < 0.2

Since the p-value > 5%, we fail to reject H0. There is insufficient evidence to suggest there is a difference in daily sales by the telemarketers before and after the script change.
4.
This is a large data set, so we will use Excel. Differences appear normally distributed. See the histogram below.
[Histogram: Number of Hours Studied Over a Four-Week Period; x-axis: (Hours Studied by Male Students) - (Hours Studied by Female Students), y-axis: Frequency]
H0: µD = 0
H1: µD < 0
(The order of subtraction is hours studied by male students – hours studied by female students. If female students study more, these differences will tend to be negative.)
α = 0.02
Output from the Data Analysis t-test: paired two-sample for means is shown below.

t-Test: Paired Two Sample for Means
                                 Males        Females
Mean                             112.76       129.02
Variance                         2307.497     1774.323
Observations                     100          100
Pearson Correlation              -0.02797
Hypothesized Mean Difference     0
df                               99
t Stat                           -2.51047
P(T<=t) one-tail                 0.006839
t Critical one-tail              1.660392
P(T<=t) two-tail                 0.013677
t Critical two-tail              1.984217
This is a one-tailed test, so the p-value is 0.007. This is less than α, so we reject H0. There is sufficient evidence to suggest that female students study more than male students. However, results must be interpreted with caution, as there are many factors that affect how much a student studies.
The completed Excel template for the 96% confidence interval estimate is shown below.

Confidence Interval Estimate for the Population Mean
  Do the sample data appear to be normally distributed?   yes
  Confidence Level (decimal form)    0.96
  Sample Mean                        -16.26
  Sample Standard Deviation s        64.7688
  Sample Size n                      100
  Upper Confidence Limit             -2.7806
  Lower Confidence Limit             -29.739
The 96% confidence interval estimate of the difference in the average number of hours that male and female students study over a four-week period is (-29.7, -2.8).

5.
We are told the histogram of differences appears to be normally distributed. We are given summary data.
H0: µD = 0
H1: µD > 0
(The order of subtraction is fuel consumption without checking tires – fuel consumption checking tires. If checking the tires improves fuel consumption, that is, reduces it, then these differences would tend to be positive.)
α = 0.04
We could do this question by hand or with the Excel template.
Making Decisions About the Population Mean with a Single Sample
  Do the sample data appear to be normally distributed?   yes
  Sample Standard Deviation s              1.4
  Sample Mean                              0.4
  Sample Size n                            20
  Hypothetical Value of Population Mean    0
  t-Score                                  1.27775
  One-Tailed p-Value                       0.10836
  Two-Tailed p-Value                       0.21673
This is a one-tailed test, so the p-value is 0.11. This is > 0.04, so we fail to reject H0. There is insufficient evidence to support the association’s claim that checking tire pressure regularly improves fuel consumption. However, there may be other explanatory factors at play. Although fuel consumption was recorded during two summer months, driving behavior and weather could have been quite different in the two months, and so we cannot consider this test to be definitive.
Develop Your Skills 9.2

6. This is a large data set, so we will use Excel. First check the histogram of differences. The differences appear approximately normally distributed. [Note that you should not be fooled into thinking that a WSRST will be required for these questions, just because they came right after the discussion of the WSRST in the text. Which test to use depends on the conditions.]
[Histogram: Weekly Worker Errors Before and After a Training Program; x-axis: (Weekly Worker Errors Before Training) - (Weekly Worker Errors After Training), y-axis: Number of Workers]
We can use the t-test for matched pairs.
H0: µD = 0
H1: µD > 0
(The order of subtraction is errors before training – errors after training. If the training reduced the number of errors, these differences would tend to be positive.)
α = 0.04
t-Test: Paired Two Sample for Means
                                 Weekly Errors     Weekly Errors
                                 Before Training   After Training
Mean                             16.64             16.04
Variance                         7.081212121       3.23070707
Observations                     100               100
Pearson Correlation              0.140311038
Hypothesized Mean Difference     0
df                               99
t Stat                           2.00337553
P(T<=t) one-tail                 0.023935075
t Critical one-tail              1.660391157
P(T<=t) two-tail                 0.047870149
t Critical two-tail              1.9842169
This is a one-tailed test, so the p-value is 0.024. We reject H0. There is enough evidence to suggest that weekly worker errors declined after the training.
7.
We have seen a similar problem in Develop Your Skills 9.1 Exercise 2, but the data set has changed. Now, the differences are non-normal.
[Histogram: Sales for Gourmet Cookies, Before and After Packaging Redesign; x-axis: (Sales After Packaging Redesign) - (Sales Before Packaging Redesign), y-axis: Number of Stores]
The sample size is small. The histogram is not perfectly symmetric, but it does show a somewhat symmetric U-shape, so we will proceed with the WSRST.

H0: the populations of weekly sales of gourmet cookies before and after the packaging redesign are the same
H1: the population of weekly sales of gourmet cookies after the packaging redesign is to the right of the population of weekly sales before the packaging redesign (that is, weekly sales of gourmet cookies are generally greater after the packaging redesign, compared to before the redesign)
α = 0.05

Now we must rank the differences (their absolute values) and compute W+ and W-.
The table below summarizes.

Differences   Absolute Value    Ordered Absolute   Ranks To Be
              Of Differences    Differences        Assigned
 128.33       128.33             26.46              1
-132.83       132.83             61.51              2
 -88.38        88.38             88.38              3
 144.51       144.51            114.78              4
-114.78       114.78            128.33              5
  26.46        26.46            130.72              6
 -61.51        61.51            132.83              7
 154.10       154.10            134.48              8
 134.48       134.48            144.51              9
-130.72       130.72            154.10             10
                                sum of ranks       55

Ranks for positive differences: 1, 5, 8, 9, 10, so W+ = 33
Ranks for negative differences: 2, 3, 4, 6, 7, so W- = 22
The order of subtraction is sales after the packaging redesign – sales before the packaging redesign. If the packaging redesign increased sales, these differences would tend to be positive. Many positive differences would lead to a high rank sum for the positive differences. So, p-value = P(W+ ≥ 33). Since the sample size is small, we turn to the WSRST table, for nW = 10. We see P(W+ ≥ 44) = 0.053, so we know P(W+ ≥ 33) > 0.053. We fail to reject H0. There is insufficient evidence to suggest that sales of the gourmet cookies increased after the packaging was redesigned.
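The ranking-and-summing step above can be automated. A small sketch in plain Python (not part of the Excel add-in used in the text), which assigns average ranks to any ties in the absolute differences:

```python
# Differences: (sales after redesign) - (sales before redesign), as in the solution
diffs = [128.33, -132.83, -88.38, 144.51, -114.78,
         26.46, -61.51, 154.10, 134.48, -130.72]

# Drop zero differences, then order by absolute value
nonzero = [d for d in diffs if d != 0]
by_abs = sorted(nonzero, key=abs)

# Assign ranks, averaging across groups of tied absolute values
ranks = {}
i = 0
while i < len(by_abs):
    j = i
    while j < len(by_abs) and abs(by_abs[j]) == abs(by_abs[i]):
        j += 1
    avg_rank = (i + 1 + j) / 2        # average of ranks i+1 .. j
    for k in range(i, j):
        ranks.setdefault(abs(by_abs[k]), avg_rank)
    i = j

w_plus = sum(ranks[abs(d)] for d in nonzero if d > 0)
w_minus = sum(ranks[abs(d)] for d in nonzero if d < 0)
print(w_plus, w_minus)  # 33.0 and 22.0
```

The two rank sums always total nW(nW + 1)/2, which is a useful arithmetic check.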
8.
We have seen a similar problem in Develop Your Skills 9.1, Exercise 3, but the data set has changed. Now, the differences are non-normal. This is such a small data set that a histogram is not all that useful. We must be cautious making any conclusions about the locations of the before and after populations.
[Histogram: Daily Sales by Telemarketers; x-axis: (Number of Daily Sales Before Script Changed) - (Number of Daily Sales After Script Changed), y-axis: Frequency]
H0: the populations of sales of telemarketers before and after the script is changed are the same
H1: the population of sales of telemarketers before the script is changed is either to the right or the left of the population of sales of telemarketers after the script is changed
α = 0.05
Differences   Absolute Value    Ordered Absolute   Ranks To Be
              Of Differences    Differences        Assigned
-11           11                 1                  1.5
-10           10                 1                  1.5
-10           10                 2                  3.5
 -9            9                 2                  3.5
 -1            1                 7                  5
  1            1                 9                  6
  2            2                10                  7.5
  7            7                10                  7.5
  2            2                11                  9
                                sum of ranks       45

Ranks for positive differences: 1.5, 3.5, 3.5, 5, so W+ = 13.5
Ranks for negative differences: 1.5, 6, 7.5, 7.5, 9, so W- = 31.5
Now we turn to the table for nW = 9. We see that P(W- ≥ 31.5) > 0.064. This is a two-tailed test, so the p-value > 2 • 0.064 = 0.128. We fail to reject H0. There is insufficient evidence to suggest a difference in the locations of the populations of sales of the telemarketers before and after the script is changed.
9.
We have seen a similar problem in Develop Your Skills 9.1 Exercise 4, but the data set has changed. This is a large data set, so we will use Excel. A histogram of differences is shown below.
[Histogram: Number of Hours Studied Over a Four-Week Period; x-axis: (Hours Studied by Male Students) - (Hours Studied by Female Students), y-axis: Number of Student Pairs]
This histogram is fairly symmetric, and almost normal-looking. The problem is that there is an outlier, which significantly affects the mean in this data set. This outlier may be an error (or a lie). It occurs with an observation of a male student who claims to have studied 437 hours over the four-week period. Since this amounts to about 15 ½ hours of studying per day, it is suspicious. However, we have no way to check the accuracy of the data.

H0: the populations of hours of study for male and female students are the same
H1: the population of hours of study for male students is to the left of (below) the population of hours of study for female students
α = 0.04

We will use the Excel add-in Wilcoxon Signed Rank Sum Test Calculations to get W+ and W- for this data set. The output is as follows.
Wilcoxon Signed Rank Sum Test Calculations
  sample size   100
  W+            2189
  W-            2861
Since the sample size is large, at 100, we can use the Excel template based on the normal approximation to the sampling distribution for this test.

Making Decisions About Matched Pairs, Quantitative Data, Non-Normal Differences (WSRST)
  Sample Size                                  100
  Is the sample size at least 25?              yes
  Is the histogram of differences symmetric?   yes
  W+                                           2189
  W-                                           2861
  z-Score                                      1.15527715
  One-Tailed p-Value                           0.12398848
  Two-Tailed p-Value                           0.24797695
This is a one-tailed test, so the p-value is 0.124. We fail to reject H0. There is insufficient evidence to suggest that male students study less than female students.
10. We have seen a similar problem in Develop Your Skills 9.1 Exercise 5, but the data set has changed. The histogram of differences is skewed to the left, and the sample size is fairly small, so we will use the WSRST for this analysis.
[Histogram: L/100 km for Cars During Summer Months; x-axis: (L/100 km Without Checking Tire Pressure Regularly) - (L/100 km When Checking Tire Pressure Regularly), y-axis: Frequency]
H0: the populations of L/100 km fuel consumption for cars with and without tire pressures checked regularly are the same
H1: the population of L/100 km fuel consumption for cars without tire pressures checked regularly is to the right of the population of L/100 km fuel consumption for cars with tire pressures checked regularly (that is, fuel consumption is higher for cars without tire pressures checked regularly)
α = 0.04

We will use Excel for the calculations.
Wilcoxon Signed Rank Sum Test Calculations
  sample size   20
  W+            160.5
  W-            49.5
Since sample size is < 25, we will use the WSRST tables.
The order of subtraction is fuel consumption without checking tires – fuel consumption checking tires. If checking tires regularly improves (reduces) fuel consumption, then these differences would tend to be positive. So, we can focus on W+ for the p-value.

0.01 < P(W+ ≥ 160.5) < 0.024 (from the table)

This is a one-tailed test. We reject H0. There is sufficient evidence to suggest that the population of L/100 km fuel consumption for cars without tire pressures checked regularly is to the right of the population for cars with tire pressures checked regularly. However, as noted before, although this provides some evidence that checking tires regularly reduces fuel consumption, there may be other explanatory factors at play. Although fuel consumption was recorded during two summer months, driving behavior and weather could have been quite different in the two months, and so we cannot consider this test to be definitive.

Develop Your Skills 9.3

11. H0: p = 0.5 (half the cola drinkers prefer Cola A, half prefer the other brand)
H1: p ≠ 0.5 (there is a difference in preferences for Cola A and the other brand)
α = 0.05
nST = 16 - 1 = 15
n+ = 9, so n- = 6
P(n+ ≥ 9, n = 15, p = 0.5) = 1 - P(n+ ≤ 8) = 1 - 0.696 = 0.304
Since this is a two-tailed test, the p-value = 2 • 0.304 = 0.608. We fail to reject H0. There is insufficient evidence to suggest there is a difference in preferences for Cola A and the other brand.

12. First, assess the differences in the ratings.

Shopper   Rating for Ford Dealer   Rating for Honda Dealer   Difference (Ford – Honda)
1         5                        1                         +
2         2                        3                         -
3         3                        4                         -
4         4                        2                         +
5         2                        3                         -
6         2                        2                         0
7         5                        1                         +
8         3                        2                         +
9         3                        3                         0
H0: p = 0.5 (the ratings for the car shopping experience are the same at the Ford and Honda dealers)
H1: p ≠ 0.5 (there is a difference in ratings for the car shopping experience at the Ford and Honda dealers)
α = 0.04
nST = 9 (differences) - 2 (differences of zero) = 7
n+ = 4, so n- = 3
Because this is a two-tailed test, we can focus on either n+ or n-. It is easier to use the lower numbers, with the tables.
P(n- ≤ 3, nST = 7, p = 0.5) = 0.5
This is a two-tailed test, so p-value = 2 • 0.5 = 1.0. We fail to reject H0. There is insufficient evidence to suggest that there is a difference in the ratings for the car shopping experience at the Ford and Honda dealers.

13. First, analyze the differences in the ratings.

Analyst   Rating for North America   Rating for Europe   Difference
1         3                          4                   -
2         2                          3                   -
3         4                          2                   +
4         3                          2                   +
5         3                          1                   +
6         2                          3                   -
7         3                          2                   +
8         3                          2                   +
9         3                          4                   -
10        2                          1                   +
11        4                          4                   0
H0: p = 0.5 (the ratings by analysts for the North American and European economies are the same)
H1: p ≠ 0.5 (there is a difference in ratings for the North American and European economies by all analysts)
α = 0.03
nST = 11 - 1 = 10
n+ = 6, n- = 4
P(n- ≤ 4, nST = 10, p = 0.5) = 0.377
This is a two-tailed test, so p-value = 2 • 0.377 = 0.754. We fail to reject H0. There is insufficient evidence to suggest that there is a difference in the ratings for the North American and European economies by analysts.
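The Sign Test tail probability looked up in the tables is just a binomial sum, so it can be reproduced exactly in plain Python (a cross-check, not part of the textbook's table-based solution):

```python
from math import comb

# Exact Sign Test tail probability from the binomial distribution
# (nST = 10 non-zero differences, p = 0.5 under H0)
n_st = 10
n_minus = 4

# P(n- <= 4) = sum of binomial probabilities for k = 0..4
p_tail = sum(comb(n_st, k) for k in range(n_minus + 1)) / 2 ** n_st
p_value = 2 * p_tail   # two-tailed test
print(round(p_tail, 3), round(p_value, 3))  # 0.377 and 0.754
```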
14. H0: p = 0.5 (wine-drinkers rate Californian and French wines the same)
H1: p > 0.5 (wine-drinkers rate Californian wines higher than French wines, where p is the proportion of wine-drinkers who prefer Californian wines)
α = 0.03
Use the Excel template.

Making Decisions About Matched Pairs, Ranked Data (Sign Test)
  Number of Non-Zero Differences   225
  Number of Positive Differences   150
  Number of Negative Differences   75
  One-Tailed p-Value               3.2E-07
  Two-Tailed p-Value               6.4E-07
This is a one-tailed test, so the p-value is 3.2 • 10^-7, which is very small. It would be almost impossible to get sample results like this if wine-drinkers rated Californian and French wines the same. We reject H0. There is sufficient evidence to suggest that wine-drinkers rate Californian wines higher than French wines. This conclusion presumes that wine-drinkers who attend wine and cheese shows are representative of all wine-drinkers, which may not be the case.

15. H0: p = 0.5 (potential customers are equally ready to buy an HDTV before and after seeing an ad about HDTVs)
H1: p > 0.5 (potential customers are more ready to buy an HDTV after seeing an ad about HDTVs; p is the proportion of potential customers more likely to buy an HDTV after seeing the ad)
α = 0.05
First we use the Non Parametric Tool, and the Sign Test Calculations, to analyze the data.

Sign Test Calculations
  # of non-zero differences   132
  # of positive differences   47
  # of negative differences   85
The order of subtraction is willingness to buy before the ad – willingness to buy after the ad. A higher number indicates a greater willingness to buy, so if the ad increases willingness to buy, this should result in more minus signs. We see that there are 85 negative differences, which indicates that 85 of 132 customers increased their willingness to buy after seeing the ad.
Making Decisions About Matched Pairs, Ranked Data (Sign Test)
  Number of Non-Zero Differences   132
  Number of Positive Differences   47
  Number of Negative Differences   85
  One-Tailed p-Value               0.0006
  Two-Tailed p-Value               0.0012
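The template's binomial p-value, and the normal approximation used in the by-hand work, can both be reproduced in a few lines of plain Python (a cross-check outside the Excel workflow):

```python
from math import comb, sqrt

# Sign Test results, as given in the solution
n_st = 132       # non-zero differences
n_minus = 85     # customers more willing to buy after the ad

# Exact one-tailed p-value: P(at least 85 of 132 signs one way | p = 0.5)
p_exact = sum(comb(n_st, k) for k in range(n_minus, n_st + 1)) / 2 ** n_st

# Normal approximation to the sampling distribution of p-hat
z = (n_minus / n_st - 0.5) / sqrt(0.5 * 0.5 / n_st)
print(round(p_exact, 4), round(z, 2))  # ~ 0.0006 and 3.31
```

The exact binomial value and the normal-approximation tail area agree closely here, as the text notes.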
From the template, we see the one-tailed p-value is 0.0006, which is quite small. We reject H0. There is sufficient evidence to infer that willingness to buy an HDTV increased after potential customers saw the ad.

We can also do this question by hand. Sampling is done without replacement. There are 132 customers in the sample, presumably less than 5% of all potential customers for HDTVs. nST = 132 > 20, so the sampling distribution of p̂ will be approximately normal.

z = (p̂ - p) / √(pq/nST) = ((85/132) - 0.5) / √((0.5)(0.5)/132) = 3.31

p-value = P(z ≥ 3.31) = 1 - 0.9995 = 0.0005

Since the p-value is < α, we reject H0. There is sufficient evidence to suggest that potential customers are more ready to buy an HDTV after seeing an ad about HDTVs. Notice that the p-values with the template (based on the binomial distribution) and the by-hand approximation are almost equal here.

Chapter Review Exercises

1. Matched-pairs samples are better than independent samples for exploring cause and effect, because they control some of the potential causal variables, and therefore take them out of the picture. If we match Business grads according to age, experience, location, and academic performance, then we know that any difference in salary is not caused by these factors.

2.
These cannot be matched pairs, because the sample sizes are different. You can be sure that the samples are independent if the sample sizes are different.
3.
The two different approaches will usually, but not always, lead to the same conclusion. It is harder to reject the null hypothesis with the Wilcoxon Signed Rank
Sum Test, and this is why the t-test of µD is preferred, if the necessary conditions are met. Remember, the Wilcoxon Signed Rank Sum Test works with the ranks of the values, not the actual values, and so it gives up some of the information available in the sample data. 4.
The computer-based version of the Sign Test is based on the binomial distribution. The version using the sampling distribution of p̂ is an approximation. While the approximation can be quite good, the actual value provided by the binomial distribution is more accurate.
5.
Tom is right. He has just expressed your conclusion in a different way. If the new version is rated more highly than the old version, we can also say that the old version's ratings are lower than the new version's. This exercise is a reminder that you should read and think carefully about these comparisons, and not get mixed up in the language.
6.
H0: ratings for the two beer recipes are the same
H1: ratings for the two beer recipes are different
α = 5%
First analyze the data.

Taste Test of Beer
Tester   Beer Recipe #3   Beer Recipe #4   Difference
1        1                4                -
2        3                2                +
3        2                1                +
4        5                3                +
5        3                4                -
6        2                1                +
7        4                5                -
8        1                3                -
9        2                1                +
10       3                4                -

We see nST = 10, n+ = 5, n- = 5. At this point, we can clearly see that we have no evidence of a difference, because the number of positive differences exactly matches the number of negative differences. Fail to reject H0. There is insufficient evidence of a difference in ratings for the two beer recipes.
7.
H0: p = 0.5 (students rate the two designs the same) H1: p ≠ 0.5 (students rate the two designs differently) α = 0.025
Sampling is done without replacement. We do not know the total number of students at the college. As long as there are 8,000 or more, the sample of 400 will be less than about 5% of the population, and we can use the binomial distribution. nST = 400 - 27 = 373 > 20, so the sampling distribution of p̂ will be approximately normal.

z = (p̂ - p) / √(pq/nST) = ((207/373) - 0.5) / √((0.5)(0.5)/373) = 2.12

p-value = 2 • P(z ≥ 2.12) = 2 • (1 - 0.9830) = 2 • 0.0170 = 0.034 > α

Fail to reject H0. There is not enough evidence to suggest that the students rate the two designs differently.

8.
H0: µD = 0
H1: µD > 0
(The order of subtraction is (time without tool – time with tool). If the tool speeds work up, times should be longer without the tool, and the differences will be positive.)
α = 0.05
We are told to assume the differences are normally distributed. We are given: x̄D = 3.4 minutes, sD = 4.6, nD = 18

t = (x̄D - µD) / (sD / √nD) = (3.4 - 0) / (4.6 / √18) = 3.136

Referring to the t-table, we look at the row for 17 degrees of freedom. The t-score of 3.136 is to the right of t.005. So, p-value < 0.005. Reject H0. There is sufficient evidence to suggest that the tool speeds up the work, assuming all other explanatory factors are the same.
9. H0: µD = 0
H1: µD > 0
(The order of subtraction is (price for job in wealthy neighbourhood – price for job in run-down neighbourhood). If the contractors charge more in the wealthier neighbourhoods, the differences will be positive.)
α = 0.05
We are told to assume the differences are normally distributed. We are given: x̄D = 1262, sD = 478, nD = 10
The Excel template is shown below.

Making Decisions About the Population Mean with a Single Sample
  Do the sample data appear to be normally distributed?   yes
  Sample Standard Deviation s              478
  Sample Mean                              1262
  Sample Size n                            10
  Hypothetical Value of Population Mean    0
  t-Score                                  8.34894
  One-Tailed p-Value                       7.9E-06
  Two-Tailed p-Value                       1.6E-05
The p-value is very small. Reject H0. There is sufficient evidence to suggest that the contractors charge higher prices in wealthier neighbourhoods.
10. The Excel template is shown below. Of course, this could also be done without Excel.

Confidence Interval Estimate for the Population Mean
  Do the sample data appear to be normally distributed?   yes
  Confidence Level (decimal form)    0.95
  Sample Mean                        1262
  Sample Standard Deviation s        478
  Sample Size n                      10
  Upper Confidence Limit             1603.94
  Lower Confidence Limit             920.059
We have 95% confidence that the interval ($920, $1,604) contains the true premium that contractors charge on a bathroom renovation in a wealthy neighbourhood.

11. H0: p = 0.5 (diners rate the two salads the same)
H1: p > 0.5 (diners rate the mixed green salad higher, where p is defined as the proportion of diners who prefer the mixed green salad)
α = 0.03
Sampling is done without replacement. We do not know the total number of diners at the restaurant. As long as there are 700 or more, the sample of 35 will be less than about 5% of the population, and we can use the binomial distribution.
nST = 35 - 3 = 32
We could use the sampling distribution of p̂ here, but the approximation will not be that good, because the sample size is fairly small. Instead we will use the Excel template.

Making Decisions About Matched Pairs, Ranked Data (Sign Test)
  Number of Non-Zero Differences   32
  Number of Positive Differences   20
  Number of Negative Differences   12
  One-Tailed p-Value               0.10766
  Two-Tailed p-Value               0.21533
This is a one-tailed test. The p-value is 0.108. Fail to reject H0. There is insufficient evidence to suggest that diners prefer the mixed green salad.
12. H0: people are willing to pay similar prices for a spa weekend in the city and in the country
H1: people are willing to pay more for a spa weekend in the country
α = 0.03
The order of subtraction was (price for country spa weekend – price for city spa weekend). A large W+ provides evidence in favour of H1.
p-value = P(W+ ≥ 1751)
Since the sample size is > 25, we can use the normal approximation to the sampling distribution of W.

z = (W - µW) / σW = (W - nW(nW + 1)/4) / √(nW(nW + 1)(2nW + 1)/24)
  = (1751 - 75(76)/4) / √(75(76)(151)/24)
  = 326 / 189.374
  = 1.72
p-value = P(z ≥ 1.72) = 1 - 0.9573 = 0.0427
Fail to reject H0. There is not enough evidence to suggest that people are willing to pay more for a spa weekend in the country than a spa weekend in the city, at the 3% level of significance.

13. There is sufficient evidence to reject the hypothesis that the tasks are completed in the same time with the two programs. There is sufficient evidence, at the 5% level of significance, to suggest that there is a difference in the amount of time it takes to complete tasks with the two programs. The new software would be recommended. The interval (3.9 minutes, 14.3 minutes) probably contains the average reduction in time on task with the new software.

14. H0: There is no difference in the locations of the populations of sales of soup for each package design.
H1: There is a difference in the locations of the populations of sales of soup for each package design.
α = 0.05
z = (W - µW) / σW = (W - nW(nW + 1)/4) / √(nW(nW + 1)(2nW + 1)/24)
  = (300 - 26(27)/4) / √(26(27)(53)/24)
  = 124.5 / 39.373
  = 3.16
p-value = 2 • P(z ≥ 3.16) = 2 • (1 - 0.9992) = 0.0016 < 0.05
Reject H0. There is sufficient evidence to suggest that there is a difference in sales of soup for each package design.
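The normal approximation for W can be carried through without table rounding in a few lines of plain Python (a cross-check, not part of the by-hand solution); the two-tailed area comes out near 0.0016, still far below α = 0.05, so the conclusion is unchanged:

```python
from math import sqrt, erfc

# Normal approximation to the sampling distribution of W
# (nW = 26 pairs and rank sum W = 300, as given in the solution)
n_w = 26
w = 300

mu_w = n_w * (n_w + 1) / 4                             # 175.5
sigma_w = sqrt(n_w * (n_w + 1) * (2 * n_w + 1) / 24)   # ~39.373

z = (w - mu_w) / sigma_w
p_two = erfc(z / sqrt(2))   # two-tailed area: 2 * P(Z >= z)
print(round(z, 2), round(p_two, 4))  # ~ 3.16 and ~ 0.0016
```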
15. H0: µD = 0
H1: µD < 0 (for order of subtraction (new business before training – new business after training))
α = 0.025
We are told we can assume the differences are normally distributed. First, calculate the differences.

Staff Member   Monthly New Business      Monthly New Business     Difference
               Before Training ($000s)   After Training ($000s)
Shirley        $230                      $240                     -$10
Tom            $150                      $165                     -$15
Janice         $100                      $90                       $10
Brian          $75                       $100                     -$25
Ed             $340                      $330                      $10
Kim            $500                      $525                     -$25

Using standard formulas, we calculate: x̄D = -$9.16667, sD = 15.942605, nD = 6
t=
x D − µ D − 9.166673 − 0 = = −1.408 sD 15.942605 6 nD
We refer to the t-table, looking at the row of critical values for 5 degrees of freedom. Since t.100 = 1.476, we know P(t ≤ -1.408) > 0.10. Fail to reject H0. There is insufficient evidence to infer that monthly new business increased after the training.
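The t statistic can be checked with a few lines of standard-library Python; the list below holds the six differences from the table:

```python
import math
import statistics

# Differences (before minus after), in $000s, from the table above
diffs = [-10, -15, 10, -25, 10, -25]

n = len(diffs)
mean_d = statistics.mean(diffs)        # -9.1667
sd_d = statistics.stdev(diffs)         # sample standard deviation, 15.9426
t = (mean_d - 0) / (sd_d / math.sqrt(n))
print(round(t, 3))                     # -1.408
```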
16.
x̄D ± (critical t-score)(sD/√nD)
-10.833333 ± 2.571(16.557979/√6)
(-28.21, 6.542)

Yes, the interval should contain zero, since there was not enough evidence to conclude there was a difference in monthly new business before and after the training. A one-tailed test with α = 0.025 corresponds to a 95% confidence interval, which has 0.025 in each tail.

17.
With non-normal differences, we must use the Wilcoxon Signed Rank Sum Test.
Staff Member   Difference   Absolute Value of Difference   Ordered Differences   Ranks To Be Assigned   Ranks For Positive Differences   Ranks For Negative Differences
Shirley        -$10         10                             10                    1                                                       2
Tom            -$15         15                             10                    2                                                       4
Janice         $10          10                             10                    3                      2
Brian          -$25         25                             15                    4                                                       5.5
Ed             $10          10                             25                    5                      2
Kim            -$25         25                             25                    6                                                       5.5
                                                           sums                  21                     W+ = 4                           W- = 17
Because of the order of subtraction, we expect W- to be the largest rank sum, which it is. p-value = P(W ≥ W-) = P(W ≥ 17) From the table, we see P(W ≥ 18) = 0.078, so P(W≥ 17) > 0.078. Fail to reject H0. There is insufficient evidence to suggest that monthly new business increased after the training.
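The rank sums W+ = 4 and W- = 17, including the midranks for the tied absolute differences, can be reproduced with a short sketch (our own helper, not a library routine):

```python
def signed_rank_sums(diffs):
    """Return (W+, W-) for the Wilcoxon Signed Rank Sum Test,
    using midranks for tied absolute differences.
    Zero differences are assumed to have been removed already."""
    pairs = sorted((abs(d), d) for d in diffs)   # sort by absolute difference
    ranks = [0.0] * len(pairs)
    i = 0
    while i < len(pairs):
        j = i
        while j < len(pairs) and pairs[j][0] == pairs[i][0]:
            j += 1                               # j is one past the tie group
        midrank = (i + 1 + j) / 2                # average of ranks i+1 .. j
        for k in range(i, j):
            ranks[k] = midrank
        i = j
    w_plus = sum(r for r, (_, d) in zip(ranks, pairs) if d > 0)
    w_minus = sum(r for r, (_, d) in zip(ranks, pairs) if d < 0)
    return w_plus, w_minus

print(signed_rank_sums([-10, -15, 10, -25, 10, -25]))    # (4.0, 17.0)
```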
18. Because these are ranked data, we will use the Sign Test. First, record the differences in the ratings.

Taste Test of Yogurt Formulations
Taster   Recipe 1   Recipe 2   Difference
1        1          2          -
2        4          1          +
3        2          3          -
4        5          4          +
5        3          2          +
6        2          1          +
7        3          2          +
8        2          5          -
9        5          2          +
10       4          3          +

H0: p = 0.5 (tasters rate the two formulations of yogurt the same)
H1: p ≠ 0.5 (tasters prefer one yogurt over the other)
α = 0.025
nST = 10, n+ = 7, n- = 3
P(n- ≤ 3, nST = 10, p = 0.5) = 0.172
p-value = 2 • 0.172 = 0.344 > 0.025
Fail to reject H0. There is insufficient evidence to suggest that the tasters prefer one yogurt over the other.

19. H0: µD = 0
H1: µD < 0 (for order of subtraction (completion time with new-style drill) - (completion time with old-style drill))
α = 0.05
We are told to assume the differences in completion times are normally distributed. We are given x̄D = -5.2, sD = 12.2, nD = 20.

t = (x̄D - µD)/(sD/√nD) = (-5.2 - 0)/(12.2/√20) = -1.91
We refer to the t-table, looking at the row with n - 1 = 20 - 1 = 19 degrees of freedom. We see that 1.91 is located between t.050 and t.025. This is a one-tailed test.
0.025 < p-value < 0.05
Reject H0. There is sufficient evidence to suggest that task completion times with the new-style drill are shorter than with the old-style drill.

20. a. We could consider these quantitative data, as we do with scores on a statistics test, and that is how we will proceed here. But an argument could also be made that these are ranked data. It may not be possible to produce scores for self-esteem that are objective and reproducible.

b.
Differences in test scores do not appear normal, but are somewhat symmetric.
[Histogram: Differences in Scores on Self-Esteem Test, Before and After Seminar; x-axis: (Score Before Seminar) - (Score After Seminar), y-axis: Frequency]
H0: self-esteem test scores are the same before and after the seminar
H1: self-esteem test scores are higher after the seminar
α = 5%
We used the Non Parametric Tools add-in Wilcoxon Signed Rank Sum Test Calculations to get the rank sums shown below.
Wilcoxon Signed Rank Sum Test Calculations
sample size: 18
W+: 52
W-: 119
Since sample size is less than 25, we use the tables to estimate the p-value. The order of subtraction was (test scores before the seminar) – (test scores after the seminar). If the seminar increased test scores, these differences would tend to be negative. So, we focus on W-, which is 119. P(W- ≥ 119) is the p-value. This rank sum is lower than any shown in the table. We conclude p-value > 0.054. Fail to reject H0. There is not enough evidence to suggest that the self-esteem test scores are higher after the seminar.
21.
A histogram of the differences is shown below.
[Histogram: Differences in Salaries of Business and Computer Studies Graduates; x-axis: (Salary of Business Graduate) - (Salary of Computer Studies Graduate), y-axis: Frequency]
The differences are not perfectly normally distributed. However, the sample size is fairly large, so we will continue with the t-test.
H0: µD = 0
H1: µD ≠ 0 (The order of subtraction is Business salary - Computer Studies salary.)
α = 0.025
We can use Excel functions and the template, as shown below.
Making Decisions About the Population Mean with a Single Sample
Do the sample data appear to be normally distributed? yes
Sample Standard Deviation s: 4684.629
Sample Mean: -$533.333
Sample Size n: 30
Hypothetical Value of Population Mean: 0
t-Score: -0.623569
One-Tailed p-Value: 0.268893
Two-Tailed p-Value: 0.537785
The two-tailed p-value is 0.538 > 0.025. Fail to reject H0. There is insufficient evidence to suggest that salaries of Business grads are different from salaries of Computer Studies grads.
22. A histogram of the differences is shown below.
[Histogram: Differences in Commuting Times for Workers at a Honda Plant in Alliston; x-axis: (Commuting Time in Minutes for 8 AM Arrival) - (Commuting Time in Minutes for 9 AM Arrival), y-axis: Frequency]
The differences appear non-normal, but at least somewhat symmetric. We will use the WSRST.
H0: commuting times are the same for 8 am start and 9 am start times
H1: commuting times are less for the 8 am start time
α = 0.04
We use the add-in to compute the rank sums.

Wilcoxon Signed Rank Sum Test Calculations
sample size: 30
W+: 115.5
W-: 349.5
W- is the largest rank sum, which supports H1, given the order of subtraction is (8 am start commuting time – 9 am start commuting time).
Since the sample size is large, we can use the normal approximation to the sampling distribution of W. The Excel template is shown below (this could also be done by hand).

Making Decisions About Matched Pairs, Quantitative Data, Non-Normal Differences (WSRST)
Sample Size: 30
Is the sample size at least 25? yes
Is the histogram of differences symmetric? yes
W+: 115.5
W-: 349.5
z-Score: 2.4064957
One-Tailed p-Value: 0.0080532
Two-Tailed p-Value: 0.01610639

This is a one-tailed test. The p-value is 0.008 < 0.04. Reject H0. There is sufficient evidence to suggest that commuting times are lower for the earlier start time.

23. As usual, with a small data set, it is difficult to assess normality. One possible histogram is shown below.
The histogram is somewhat skewed to the right, but we will assume normality and proceed. The completed Excel template is shown below.
Making Decisions About the Population Mean with a Single Sample
Do the sample data appear to be normally distributed? yes
Sample Standard Deviation s: 18.1491
Sample Mean: 13.1304
Sample Size n: 23
Hypothetical Value of Population Mean: 0
t-Score: 3.46966
One-Tailed p-Value: 0.00109
Two-Tailed p-Value: 0.00218
The order of subtraction is (Playing Times Before Changes) – (Playing Times After Changes). If the changes increased the speed of play, we would expect a positive difference. The p-value is 0.001 < 0.05. Reject H0. There is enough evidence to conclude that playing times were faster after the changes were made. Note that the change in approach may have caused the faster play. However, it may be that the fact that the course marshal was obviously focused on faster play was the real cause. 24. We have already assessed normality. The completed Excel template for the confidence interval estimate is shown below.
Confidence Interval Estimate for the Population Mean
Do the sample data appear to be normally distributed? yes
Confidence Level (decimal form): 0.9
Sample Mean: 13.1304
Sample Standard Deviation s: 18.1491
Sample Size n: 23
Upper Confidence Limit: 19.6287
Lower Confidence Limit: 6.63215
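The template's limits can be verified by hand. A Python sketch, taking the critical t-score 1.717 (t.05 with 22 degrees of freedom) from the t-table:

```python
import math

# Summary values from the template above
mean_d = 13.1304
sd_d = 18.1491
n = 23
t_crit = 1.717        # t_.05 with n - 1 = 22 degrees of freedom, from the t-table

margin = t_crit * sd_d / math.sqrt(n)
print(round(mean_d - margin, 2), round(mean_d + margin, 2))   # 6.63 19.63
```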
A 90% confidence interval estimate for the difference in playing times is (6.6 minutes, 19.6 minutes). We have 90% confidence that the interval from 6.6 minutes to 19.6 minutes contains the reduction in playing times.

25. Because these are matched pairs of ranked data, we will use the Sign Test.
H0: p = 0.5 (employees rate the two presidents the same)
H1: p > 0.5 (employees rate the new president higher than the old president)
α = 0.04
We can use the Non Parametric Tools Add-In, the Sign Test Calculations, to analyze the ratings. The output is shown below.

Sign Test Calculations
# of non-zero differences: 8
# of positive differences: 6
# of negative differences: 2
We can then use the Sign Test template to complete the hypothesis test.

Making Decisions About Matched Pairs, Ranked Data (Sign Test)
Number of Non-Zero Differences: 8
Number of Positive Differences: 6
Number of Negative Differences: 2
One-Tailed p-Value: 0.14453125
Two-Tailed p-Value: 0.2890625
We see that the one-tailed p-value is 0.145 > 0.04. Fail to reject H0. There is not enough evidence to conclude that employees rate the new president higher than the old president.

26. First analyze the data. The histogram of differences looks normal.
[Histogram: Differences in Times to Complete Search Tasks Using Different Search Engines; x-axis: (Time to Complete Search Task in Minutes, Using Old Search Engine) - (Time to Complete Search Task in Minutes, Using New Search Engine), y-axis: Frequency]
The completed Excel template for the t-test of µD is shown below.

Making Decisions About the Population Mean with a Single Sample
Do the sample data appear to be normally distributed? yes
Sample Standard Deviation s: 4.94218
Sample Mean: 2.61765
Sample Size n: 34
Hypothetical Value of Population Mean: 0
t-Score: 3.08839
One-Tailed p-Value: 0.00203
Two-Tailed p-Value: 0.00406
We could also have used the Data Analysis tool, to get the following output.

t-Test: Paired Two Sample for Means
                               Time to Complete Search Task,   Time to Complete Search Task,
                               Using Old Search Engine         Using New Search Engine
Mean                           15.35294118                     12.73529412
Variance                       40.4171123                      32.38235294
Observations                   34                              34
Pearson Correlation            0.668571931
Hypothesized Mean Difference   0
df                             33
t Stat                         3.088389544
P(T<=t) one-tail               0.002031766
t Critical one-tail            1.692360258
P(T<=t) two-tail               0.004063531
t Critical two-tail            2.034515287
Either way, the result is the same.
H0: µD = 0
H1: µD > 0 (The order of subtraction is (Time Using Old Search Engine) - (Time Using New Search Engine).)
α = 0.05
p-value = 0.002 < 0.05
Reject H0. There is enough evidence to suggest that the times to complete search tasks are longer with the old search engine.

b.
First analyze the data. In this case, the differences do not appear normal. However, they do appear fairly symmetric, so we will use the WSRST.
[Histogram: Differences in Times to Complete Search Tasks Using Different Search Engines; x-axis: (Time to Complete Search Task in Minutes, Using Old Search Engine) - (Time to Complete Search Task in Minutes, Using New Search Engine), y-axis: Frequency]
The output from the Non Parametric Tools Add-In for Wilcoxon Signed Rank Sum Test Calculations is shown below.

Wilcoxon Signed Rank Sum Test Calculations
sample size: 33
W+: 282
W-: 279
The completed Excel template is shown below.

Making Decisions About Matched Pairs, Quantitative Data, Non-Normal Differences (WSRST)
Sample Size: 33
Is the sample size at least 25? yes
Is the histogram of differences symmetric? yes
W+: 282
W-: 279
z-Score: 0.02680174
One-Tailed p-Value: 0.48930893
Two-Tailed p-Value: 0.97861786
A quick look at the template reveals a very high p-value.
H0: search task completion times are the same for the old and new search engines
H1: search task completion times are lower for the new search engine
α = 0.05
p-value = 0.49 > 0.05
Fail to reject H0. There is not enough evidence to conclude that search times are lower with the new search engine.
c. We will use the Sign Test for these ranked data.

Making Decisions About Matched Pairs, Ranked Data (Sign Test)
Number of Non-Zero Differences: 29
Number of Positive Differences: 21
Number of Negative Differences: 8
One-Tailed p-Value: 0.01206
Two-Tailed p-Value: 0.02412
H0: p = 0.5 (users rate the two search engines about the same)
H1: p > 0.5 (users prefer the new search engine)
α = 0.05
p-value = 0.012 < 0.05
Reject H0. There is enough evidence to conclude that users prefer the new search engine.
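All of the Sign Test p-values in this chapter are binomial tail probabilities with p = 0.5, so they can be checked directly. A sketch (the function name is ours):

```python
from math import comb

def sign_test_one_tailed(n, n_extreme):
    """One-tailed Sign Test p-value: P(X >= n_extreme) for X ~ Binomial(n, 0.5)."""
    return sum(comb(n, k) for k in range(n_extreme, n + 1)) / 2 ** n

# Exercise 26c: 29 non-zero differences, 21 positive
print(round(sign_test_one_tailed(29, 21), 5))    # 0.01206
# Exercise 25: 8 non-zero differences, 6 positive
print(round(sign_test_one_tailed(8, 6), 8))      # 0.14453125
```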
Instructor’s Solutions Manual - Chapter 10
Chapter 10 Solutions

Develop Your Skills 10.1

1. Call the defects on the night shift population 1, and the defects on the day shift population 2.
H0: µ1 - µ2 = 0
H1: µ1 - µ2 > 0
α = 0.05
x̄1 = 35.4, x̄2 = 27.8, s1 = 15.3, s2 = 7.9, n1 = 45, n2 = 50
We are told that the population distributions of errors are normal.

t = (x̄1 - x̄2 - (µ1 - µ2)) / √(s1²/n1 + s2²/n2) = (35.4 - 27.8 - 0) / √(15.3²/45 + 7.9²/50) = 2.992
Degrees of freedom: minimum of (n1 – 1) and (n2 – 1), so minimum (44, 49) = 44. Closest row in the table is for 45 degrees of freedom. p-value < 0.005 Reject H0. There is sufficient evidence to infer that the number of defects is higher on the night shift than on the day shift, on average. Using the Excel template, we find a more exact p-value of 0.00196.
Making Decisions About the Difference in Population Means with Two Independent Samples
Do the sample data appear to be normally distributed? yes
Sample 1 Standard Deviation: 15.3
Sample 2 Standard Deviation: 7.9
Sample 1 Mean: 35.4
Sample 2 Mean: 27.8
Sample 1 Size: 45
Sample 2 Size: 50
Hypothetical Difference in Population Means: 0
t-Score: 2.99245
One-Tailed p-Value: 0.00196
Two-Tailed p-Value: 0.00393
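The unequal-variances t-score in the template is computed from summary statistics only, so it is easy to reproduce. A sketch (our own function name):

```python
import math

def two_sample_t(x1, x2, s1, s2, n1, n2, d0=0.0):
    """Unequal-variances t-score for two independent samples,
    from summary statistics."""
    se = math.sqrt(s1 ** 2 / n1 + s2 ** 2 / n2)
    return (x1 - x2 - d0) / se

# Exercise 1: night shift (sample 1) versus day shift (sample 2)
print(round(two_sample_t(35.4, 27.8, 15.3, 7.9, 45, 50), 5))   # 2.99245
```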
2.
Call the population of purchases by females population 1, and the purchases by males population 2.
H0: µ1 - µ2 = 0
H1: µ1 - µ2 > 0
α = 0.025
We start by creating histograms of the sample data to check for normality.
[Histogram: Drugstore Purchases by Females; x-axis: Value of Purchase, y-axis: Frequency]
[Histogram: Drugstore Purchases by Males; x-axis: Value of Purchase, y-axis: Number of Purchases]
Neither histogram is perfectly normal, with the purchases by females in particular showing some skewness to the right. Note that these histograms were not designed for comparison (classes are different for each), but to assess normality. We will proceed, but with some caution.
We have the data in Excel, so can use the Data Analysis tool for the calculations.

t-Test: Two-Sample Assuming Unequal Variances
                               Purchases by Females   Purchases by Males
Mean                           32.42933333            27.82428571
Variance                       85.32606381            23.95908791
Observations                   15                     14
Hypothesized Mean Difference   0
df                             22
t Stat                         1.692875747
P(T<=t) one-tail               0.05229717
t Critical one-tail            2.073873058
P(T<=t) two-tail               0.10459434
t Critical two-tail            2.40547274
The p-value for a one-tailed test is 0.0523. We fail to reject H0. There is insufficient evidence to conclude that average female purchases are higher than average male purchases at this drugstore.
3.
Use the Excel template to construct the confidence interval. Of course, this can also be done manually with the formula (x̄1 - x̄2) ± t-score • √(s1²/n1 + s2²/n2).

Confidence Interval Estimate for the Difference in Population Means
Do the sample data appear to be normally distributed? yes
Sample 1 Standard Deviation: 9.23721
Sample 2 Standard Deviation: 4.8948
Sample 1 Mean: $32.43
Sample 2 Mean: $27.82
Sample 1 Size: 15
Sample 2 Size: 14
Confidence Level (decimal form): 0.95
Upper Confidence Limit: 10.2621
Lower Confidence Limit: -1.052

With 95% confidence, we estimate that the interval (-1.05, 10.26) contains the true average difference in the purchases of females, compared to males, at this drugstore. We expect this interval to contain zero, since we failed to reject the hypothesis that the difference was zero in Exercise 2.
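The interval can be checked from the summary statistics. A sketch, taking the critical t-score 2.074 (t.025 with df = 22, shown as "t Critical one-tail" in the Data Analysis output for Exercise 2, which is the two-sided 95% critical value):

```python
import math

# Summary statistics from the Data Analysis output for Exercise 2
x1, x2 = 32.42933333, 27.82428571     # mean purchases, females and males
s1, s2 = 9.23721, 4.8948
n1, n2 = 15, 14
t_crit = 2.074                        # t_.025 with df = 22

se = math.sqrt(s1 ** 2 / n1 + s2 ** 2 / n2)
lower = (x1 - x2) - t_crit * se
upper = (x1 - x2) + t_crit * se
print(round(lower, 2), round(upper, 2))
```

These limits land within a few hundredths of the template's (-1.052, 10.2621); the template carries more precision in the critical t-score and degrees of freedom, so small rounding differences are expected.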
4.
Call the population of daily sales last year population 1, and the daily sales this year population 2.
H0: µ1 - µ2 = 0
H1: µ1 - µ2 < 0
α = 0.03
We start by creating histograms of the sample data to check for normality.
[Histogram: Sample of Hot Dog Vendor's Daily Sales, Last Year; x-axis: Daily Sales, y-axis: Frequency]
[Histogram: Sample of Hot Dog Vendor's Daily Sales, This Year; x-axis: Daily Sales, y-axis: Frequency]
While neither histogram is perfectly normal, they are approximately normal, and sample sizes are fairly large, at 30 and 35.
Again, using Excel, we obtain the following results.

t-Test: Two-Sample Assuming Unequal Variances
                               Last Year's Daily Sales   This Year's Daily Sales
Mean                           298.2857143               356.0666667
Variance                       3368.915966               593.9954023
Observations                   35                        30
Hypothesized Mean Difference   0
df                             47
t Stat                         -5.36356487
P(T<=t) one-tail               1.21835E-06
t Critical one-tail            1.927289911
P(T<=t) two-tail               2.4367E-06
t Critical two-tail            2.237973619
The p-value is quite small, and certainly less than α = 0.03. Reject H0. There is sufficient evidence to infer that daily sales are higher this year than last year, on average.

5. Call the listening times of the listeners aged 25 and younger population 1, and the listening times of the listeners over 25 population 2.
H0: µ1 - µ2 = 0
H1: µ1 - µ2 ≠ 0
α = 0.05
x̄1 = 256.8, x̄2 = 218.3, s1 = 50.3, s2 = 92.4, n1 = 30, n2 = 35

t = (x̄1 - x̄2 - (µ1 - µ2)) / √(s1²/n1 + s2²/n2) = (256.8 - 218.3 - 0) / √(50.3²/30 + 92.4²/35) = 2.125
Degrees of freedom, for the by-hand method: minimum (29, 34) = 29.
0.010 • 2 < p-value < 0.025 • 2
0.020 < p-value < 0.05
Since the p-value is < α = 0.05, reject H0. There is sufficient evidence to infer that listening habits differ (in terms of average listening time) by age.
Develop Your Skills 10.2

6. H0: there is no difference in the locations of the populations of ratings for trainees #1 and #2
H1: there is a difference in the locations of the populations of ratings for trainees #1 and #2
α = 0.025
Since these are ranked data, we must use the Wilcoxon Rank Sum Test. First, we have to see if the distributions are similar in shape and spread. With so few data points, this can be difficult to see. Below are two quick dot plots (you could also do a histogram with Excel) that reveal similarities in shape and spread.

[Dot plot: Ratings for Trainee #1 (scale 1 to 5)]
[Dot plot: Ratings for Trainee #2 (scale 1 to 6)]
The table below illustrates the ranking process.

Performance Ratings
Trainee #1                           Trainee #2
ratings   ordered ratings   ranks    ratings   ordered ratings   ranks
1         1                 1        4         2                 2.5
5         2                 2.5      5         4                 8
3         3                 4.5      4         4                 8
2         3                 4.5      5         5                 13.5
3         4                 8        6         5                 13.5
4         4                 8        2         5                 13.5
4         4                 8        5         5                 13.5
5         5                 13.5     5         6                 17
4         5                 13.5
          rank sum          63.5               rank sum          89.5
We need only calculate the rank sum for the smallest sample, which is the ratings for Trainee #2, but both are shown here. Note that the tables are set up for W1 to be calculated from the smallest sample, so W1 = 89.5 here (even though these are the ratings for Trainee #2). So, n1 = 8, and n2 = 9. Because sample sizes are below 10, we must use the tables to estimate the p-value. We see from the table that
0.046 < P(W ≥ 89.5) < 0.057
Since this is a two-tailed test,
0.046 • 2 < p-value < 0.057 • 2
0.092 < p-value < 0.114
Fail to reject H0. There is insufficient evidence to infer there is a difference in the locations of the ratings of Trainee #1 and Trainee #2.
Note the implications of this result. If you simply looked at the ratings, you probably would have concluded that the ratings for Trainee #1 were better (lower values are better ratings). However, the ratings are not significantly different, and the difference could just be a result of sampling variability. This means that Trainee #1 should not be promoted over Trainee #2 on the basis of these ratings. Some other criteria will have to be used to decide which trainee to promote.
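The rank sums in the table, including the midranks for ties, can be verified with a short sketch (our own helper, using the ratings from the table above):

```python
def rank_sums(sample1, sample2):
    """Rank the pooled values (midranks for ties) and return each
    sample's rank sum."""
    pooled = sorted(sample1 + sample2)
    # midrank of a value = average of the positions it occupies in the pooled order
    midrank = {}
    for v in set(pooled):
        first = pooled.index(v) + 1          # first rank occupied by v
        last = first + pooled.count(v) - 1   # last rank occupied by v
        midrank[v] = (first + last) / 2
    return (sum(midrank[v] for v in sample1),
            sum(midrank[v] for v in sample2))

trainee1 = [1, 5, 3, 2, 3, 4, 4, 5, 4]   # ratings from the table above
trainee2 = [4, 5, 4, 5, 6, 2, 5, 5]
print(rank_sums(trainee1, trainee2))     # (63.5, 89.5)
```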
7.
H0: there is no difference in the locations of the populations of distances travelled by the current best-selling golf ball and the new golf ball
H1: the population of distances travelled by the current best-selling golf ball is to the left of the population of distances travelled by the new golf ball
α = 0.05
First, we must check the histograms for normality.

[Histogram: Distances Travelled by Current Best-Selling Golf Ball; x-axis: Metres, y-axis: Frequency]
[Histogram: Distances Travelled by New Golf Ball; x-axis: Metres, y-axis: Frequency]
Both histograms are non-normal, but they are similar in shape and spread, so we proceed with the Wilcoxon Rank Sum Test.
We will use Excel to analyze these data.

Wilcoxon Rank Sum Test Calculations
sample 1 size: 12
sample 2 size: 15
W1: 221
W2: 157
We will use the template for the Wilcoxon Rank Sum Test for independent samples.

Making Decisions About Two Population Locations, Two Independent Samples of Non-Normal Quantitative Data or Ranked Data (WRST)
Sample 1 Size: 12
Sample 2 Size: 15
Are both sample sizes at least 10? yes
Are the sample histograms similar in shape and spread? yes
W1: 221
W2: 157
z-Score (based on W1): 2.58613519
One-Tailed p-Value: 0.00485294
Two-Tailed p-Value: 0.00970589

Again, for consistency, we have selected Sample 1 as the smallest sample, so that we assign W1 as the rank sum of the smallest sample, which contains the distances travelled by the new golf ball. We see that n1 = 12, and n2 = 15. Since both are ≥ 10, we can use the normal approximation to the sampling distribution of W1. If the distances travelled by the new golf ball are longer, we would expect W1 to be high.
p-value = P(W1 > 221) = 0.005
The p-value < α = 0.05. Reject H0. There is sufficient evidence to suggest that the population of distances travelled by the current best-selling ball is to the left of the population of distances travelled by the new golf ball. Note this is equivalent to saying the distances travelled by the new golf ball are to the right of the distances travelled by the current best-selling ball.
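The z-score in the template uses the normal approximation to the sampling distribution of W1. A sketch of that calculation (our own function name):

```python
import math

def wrst_z(w1, n1, n2):
    """Normal approximation z-score for the Wilcoxon rank sum W1,
    where W1 comes from the sample of size n1."""
    mu = n1 * (n1 + n2 + 1) / 2
    sigma = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    return (w1 - mu) / sigma

# Exercise 7: W1 = 221, n1 = 12, n2 = 15
print(round(wrst_z(221, 12, 15), 3))        # 2.586
# Exercise 8: W1 = 1352.5, n1 = n2 = 35
print(round(wrst_z(1352.5, 35, 35), 3))     # 1.292
```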
8.
H0: there is no difference in the locations of the populations of flight delays before takeoff, before and after the redesign of the airport
H1: the population of flight delays before takeoff before the redesign of the airport is to the right of the population of flight delays before takeoff after the redesign of the airport
α = 0.05
These are quantitative data, so we must check histograms.

[Histogram: Flight Delays Before Takeoff, Before Airport Redesign; x-axis: Minutes, y-axis: Frequency]
[Histogram: Flight Delays Before Takeoff, After Airport Redesign; x-axis: Minutes, y-axis: Frequency]
The histogram for the flight delays after the airport redesign appears non-normal. The data sets are not that similar in shape and spread. We will use the Wilcoxon Rank Sum Test, but we will be cautious about drawing conclusions about location only.
Since the data are available in Excel, we can use the Wilcoxon Rank Sum Test Calculations tool to get the rank sums, and then use the template to calculate the p-value. The results are shown below.

Wilcoxon Rank Sum Test Calculations
sample 1 size: 35
sample 2 size: 35
W1: 1352.5
W2: 1132.5

Making Decisions About Two Population Locations, Two Independent Samples of Non-Normal Quantitative Data or Ranked Data (WRST)
Sample 1 Size: 35
Sample 2 Size: 35
Are both sample sizes at least 10? yes
Are the sample histograms similar in shape and spread? no
W1: 1352.5
W2: 1132.5
z-Score (based on W1): 1.29207014
One-Tailed p-Value: 0.09816643
Two-Tailed p-Value: 0.19633286

This is a one-tailed test, so p-value = 0.098 > α = 0.05. Fail to reject H0. There is insufficient evidence to infer that the population of flight delays before takeoff before the airport redesign is to the right of the population of flight delays before takeoff after the airport redesign.
9.
H0: there is no difference in the locations of the populations of weight losses for young women aged 18-25 who take the diet pill, compared with those who do not take the diet pill
H1: the location of the population of weight losses for the young women aged 18-25 who take the diet pill is to the right of the population of weight losses of the young women who do not take the diet pill
α = 0.04
We are told the distributions of weight loss are non-normal, and that both are skewed to the right, so there is similarity in the shape of the distributions. No indication is given of the spread of the data, so we will assume similar spreads, noting that our conclusions may not be valid if this is not the case. We are given the rank sums, and can proceed manually, or use the Excel template. The completed Excel template is shown below. Of course, you could also do this calculation with the formulas.

Making Decisions About Two Population Locations, Two Independent Samples of Non-Normal Quantitative Data or Ranked Data (WRST)
Sample 1 Size: 25
Sample 2 Size: 25
Are both sample sizes at least 10? yes
Are the sample histograms similar in shape and spread? yes
W1: 700
W2: 575
z-Score (based on W1): 1.21267813
One-Tailed p-Value: 0.11262645
Two-Tailed p-Value: 0.22525291

If the weight losses with the diet pill are higher, we would expect W1 to be high. The p-value for a one-tailed test is 0.113, which is > α = 0.04. Fail to reject H0. There is insufficient evidence to infer that the population of weight losses of the young women aged 18-25 who took the diet pill is to the right of the population of weight losses for those who did not take the diet pill.
10. H0: there is no difference in the locations of the populations of food ratings by weeknight and weekend diners at a restaurant
H1: there is a difference in the locations of the populations of food ratings by weeknight and weekend diners at a restaurant
α = 0.05
These are ranked data, so we must examine the distributions for similarity in shape and spread. The dot plots below (created simply in Excel) illustrate.

[Dot plot: Ratings of Weeknight Diners (scale 1 to 5)]
[Dot plot: Ratings of Weekend Diners (scale 1 to 5)]
There is some similarity in shape, as both dot plots are skewed to the right. However, there is much less variability in the ratings of the weekend diners. We will proceed with the Wilcoxon Rank Sum Test, but we must be cautious about making conclusions about location.
The assignment of ranks is illustrated in the following table.

Ratings by Weeknight Diners            Ratings by Weekend Diners
rating   ordered ratings   rank        rating   ordered ratings   rank
4        1                 4.5         1        1                 4.5
5        1                 4.5         3        1                 4.5
1        1                 4.5         2        1                 4.5
2        1                 4.5         1        1                 4.5
1        2                 11          1        2                 11
2        2                 11          1        2                 11
2        2                 11          3        3                 15
1        4                 17          2        3                 15
1        5                 18          3        3                 15
         rank sum          86                   rank sum          85
If there was a difference in the food ratings by weeknight and weekend diners at the restaurant, we would expect W1 and W2 to be different. They are very similar here.
p-value = 2 • P(W1 > 86)
Since both samples are of size 9, we must use the tables to approximate the p-value. The closest value in the table to 86 is 104, so we can be sure P(W1 > 86) > 0.057. This means the p-value > 2 • 0.057 = 0.114.
Fail to reject H0. There is not enough evidence to conclude there is a difference in the food ratings by weekend and weeknight diners at the restaurant.

Chapter Review Exercises

Throughout these exercises, it is often possible to do the calculations manually, or with Excel. Manual calculations are sometimes illustrated, and when they are not, the results should be close to the Excel output.

1.
It is preferable to use the t-test, if the necessary conditions are met, because it is harder to reject the null hypothesis with the Wilcoxon Rank Sum Test. The t-test uses all of the information available from the sample data, while the WRST uses the ranks, not the actual values. Any time we can use the actual values to make a decision, we should.
2.
If population 1 was to the right of population 2, we would expect the values in sample 1 to be higher than the values in sample 2. As a result, the rank sum for sample 1 should be larger than for sample 2. However, when there are 10 observations in each sample (so 20 values have to be ranked), the ranks have to add up to 210. This means that the rank sum for sample 2 has to equal 210 – 78 = 132. This tells us that the values in sample 1 are generally smaller and to the left of the values in sample 2. Therefore, there is no evidence that population 1 is to the right of population 2. The p-value here would be 1 – P(W1 ≤ 78) = 1 – 0.022 = 0.978. Be sure that you think about what the rank sums are telling you. This sample result would be highly unexpected, but if you didn't think about it, you might slip and draw exactly the wrong conclusion!
3.
When samples are different sizes, they will tend to have different rank sums, even if they come from equivalent populations. The smaller sample will have a smaller rank sum, simply because there are fewer data points. So, when comparing rank sums, we have to take this into consideration. The table is based on the rank sum being calculated from the smallest of the two samples. The conclusions could be wrong if you mistakenly calculate W from the larger sample.
4.
The unequal-variances version is preferred because:
i. The unequal-variances version of the t-test will lead to the right decision, even if the variances are in fact equal (with very few exceptions).
ii. It can be hard to determine if variances are in fact equal, especially with small sample sizes. Really, you should do another sample to test for equal variances. Remember, the more times you skate across the same frozen lake, the more likely you are to observe a rare event—falling in!—and the greater the chance of a Type I error.
iii. If you mistakenly assume that variances are equal when they are not, results will be unreliable, particularly when sample sizes are unequal (and especially when the smaller sample has the larger variance).
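The Welch-Satterthwaite degrees of freedom that software computes (in place of the conservative "minimum of n1 - 1 and n2 - 1" rule used in the by-hand solutions) can be sketched as follows, using the summary statistics from Exercise 1 of Develop Your Skills 10.1:

```python
def welch_df(s1, s2, n1, n2):
    """Welch-Satterthwaite approximate degrees of freedom for the
    unequal-variances t-test."""
    v1, v2 = s1 ** 2 / n1, s2 ** 2 / n2
    return (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))

# Exercise 1 of 10.1: s1 = 15.3, n1 = 45; s2 = 7.9, n2 = 50
print(round(welch_df(15.3, 7.9, 45, 50), 1))   # 64.3, versus the conservative 44
```

The conservative rule understates the degrees of freedom, which makes the by-hand p-value bounds slightly wider than the Excel results.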
5.
The Excel template is preferred because the t-score will be more accurate than the one used for the manual calculation.
6.
Call the times managers spent on email in the past population 1, and the times managers spend on email after the new procedures have been implemented population 2.
H0: µ1 - µ2 = 0
H1: µ1 - µ2 > 0
α = 0.05
x̄1 = 49.2, x̄2 = 39.6, s1 = 22.3, s2 = 10.6, n1 = 27, n2 = 25
We are told that the population distributions of times are normal.
Copyright © 2011 Pearson Canada Inc.
Instructor’s Solutions Manual - Chapter 10
t = [(x̄1 − x̄2) − (µ1 − µ2)] / sqrt(s1²/n1 + s2²/n2)
  = (49.2 − 39.6) / sqrt(22.3²/27 + 10.6²/25)
  = 2.006
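The same arithmetic can be replicated in a few lines (a Python sketch, not part of the original solution; the degrees of freedom follow the conservative min(n1 − 1, n2 − 1) rule this manual uses for table lookups, rather than the Welch–Satterthwaite approximation that software typically reports):

```python
from math import sqrt

# Summary statistics from the solution above
xbar1, s1, n1 = 49.2, 22.3, 27
xbar2, s2, n2 = 39.6, 10.6, 25

se = sqrt(s1**2 / n1 + s2**2 / n2)   # standard error of (x̄1 - x̄2)
t = (xbar1 - xbar2) / se
df = min(n1 - 1, n2 - 1)             # conservative degrees of freedom

print(round(t, 3), df)               # 2.006 24
```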
Degrees of freedom: minimum of (n1 – 1) and (n2 – 1), so minimum(26, 24) = 24. Using the table for 24 degrees of freedom, we see 0.025 < p-value < 0.05. Reject H0. There is sufficient evidence to infer that the average time spent by managers on email was lower after the new procedures were implemented.

7.
With 90% confidence, we estimate that the interval (1.52 minutes, 17.68 minutes) contains the true reduction in the average amount of time managers spend on email after the new procedures. The completed Excel template is shown below. Of course, this could also be done manually, using the formula

(x̄1 − x̄2) ± t-score · sqrt(s1²/n1 + s2²/n2).
Confidence Interval Estimate for the Difference in Population Means
Do the sample data appear to be normally distributed?  yes
Sample 1 Standard Deviation  22.3
Sample 2 Standard Deviation  10.6
Sample 1 Mean  49.2
Sample 2 Mean  39.6
Sample 1 Size  27
Sample 2 Size  25
Confidence Level (decimal form)  0.9
Upper Confidence Limit  17.6756
Lower Confidence Limit  1.52438
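The template's limits can be reproduced from the formula (a Python sketch, not part of the original solution; the t-score of 1.687 is an assumed table value for the Welch–Satterthwaite degrees of freedom, which is what the Excel template uses):

```python
from math import sqrt

xbar1, s1, n1 = 49.2, 22.3, 27
xbar2, s2, n2 = 39.6, 10.6, 25

v1, v2 = s1**2 / n1, s2**2 / n2
se = sqrt(v1 + v2)

# Welch-Satterthwaite degrees of freedom (about 37.8 here)
df = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))

t_score = 1.687    # assumed t-value for 90% confidence at df of about 37.8
diff = xbar1 - xbar2
lower = diff - t_score * se
upper = diff + t_score * se
print(round(lower, 2), round(upper, 2))   # approximately 1.52 and 17.68
```

With the conservative df = 24 used for the manual table lookup, the t-score (1.711) and hence the interval would be slightly wider.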
8.
Call the hours spent doing unpaid work around the home by men in 2000 population 1, and the hours spent doing such work in 2009 population 2. Since it does not appear that the same men were involved in the surveys, we will treat these as independent samples. H0 : µ1 - µ2 = 0 H1 : µ1 - µ2 < 0 α = 0.025 x 1 = 2.2, x 2 = 2.6, s1 = 0.6, s2 = 1.3, n1 = 55, n2 = 55 We are told that the samples appear normally distributed, and so will assume the population distributions are. We can proceed to do the calculations manually, or with the Excel template. The completed Excel template is shown below.
Making Decisions About the Difference in Population Means with Two Independent Samples
Do the sample data appear to be normally distributed?  yes
Sample 1 Standard Deviation  0.6
Sample 2 Standard Deviation  1.3
Sample 1 Mean  2.2
Sample 2 Mean  2.6
Sample 1 Size  55
Sample 2 Size  55
Hypothetical Difference in Population Means  0
t-Score  -2.0719
One-Tailed p-Value  0.02083
Two-Tailed p-Value  0.04167
This is a one-tailed test, so the p-value is 0.021 < α = 0.025. Reject H0. There is sufficient evidence to infer that the average number of hours men spend doing unpaid work around the home has increased in 2009, compared with 2000.
9.
This question can be done manually with the formula, or with the Excel template. The completed template is shown below.
Confidence Interval Estimate for the Difference in Population Means
Do the sample data appear to be normally distributed?  yes
Sample 1 Standard Deviation  0.6
Sample 2 Standard Deviation  1.3
Sample 1 Mean  2.2
Sample 2 Mean  2.6
Sample 1 Size  55
Sample 2 Size  55
Confidence Level (decimal form)  0.95
Upper Confidence Limit  -0.0155
Lower Confidence Limit  -0.7845
We have 95% confidence that the interval (-0.78 hours, -0.02 hours) contains the change in the average amount of time men spend doing unpaid work around the house in 2000, compared with 2009. This means that (0.02 hours, 0.78 hours) contains the increase in the average amount of time spent doing unpaid work around the house in 2009 compared with 2000.

10. H0: there is no difference in the locations of the population of ratings of the appearance of the grocery store
H1: the location of the population of ratings of the appearance of the grocery store six months ago is different from the location of the population of current ratings of the grocery store
α = 0.05
These are ranked data, so we must examine the distributions for similarity in shape and spread. The diagrams below illustrate.

[Dot plot: Ratings for Grocery Store Appearance Six Months Ago, ratings 1 to 5]
[Dot plot: Current Ratings for Grocery Store Appearance, ratings 1 to 5]
The ratings appear to be similar in shape and spread. The assignment of ranks is illustrated below.
Appearance Ratings Six Months Ago (n2 = 12):
Ratings (as recorded): 5 4 5 4 2 1 4 4 3 3 5 5
Ordered:               1 2 3 3 4 4 4 4 5 5 5 5
Ranks:                 1.5 3 5.5 5.5 11.5 11.5 11.5 11.5 19.5 19.5 19.5 19.5
W2 = 139.5

Current Appearance Ratings (n1 = 11):
Ratings (as recorded): 1 5 4 5 4 4 5 3 5 4 3
Ordered:               1 3 3 4 4 4 4 5 5 5 5
Ranks:                 1.5 5.5 5.5 11.5 11.5 11.5 11.5 19.5 19.5 19.5 19.5
W1 = 136.5

(Ranks are assigned over the combined 23 observations, with tied ratings sharing the average of their ranks.)
As usual, we focus on the rank sum of the smaller sample, which contains the current ratings for grocery store appearance. n1 = 11, n2 = 12, W1 = 136.5 Since both sample sizes are larger than 10, we can use the normal approximation to the sampling distribution of W1.
µW1 = n1(n1 + n2 + 1)/2 = 11(11 + 12 + 1)/2 = 132

σW1 = sqrt[ n1·n2·(n1 + n2 + 1)/12 ] = sqrt[ 11(12)(11 + 12 + 1)/12 ] = 16.24807681

z = (W1 − µW1)/σW1 = (136.5 − 132)/16.24807681 = 0.28
This is a two-tailed test. Since W1 = 136.5 is above its expected value of 132, the p-value is 2 • P(W1 ≥ 136.5) = 2 • P(z ≥ 0.28) = 2 • (1 – 0.6103) = 0.7794. Fail to reject H0. There is insufficient evidence to infer that there is a difference between the locations of the populations of grocery store ratings for appearance six months ago and currently.
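The normal approximation can be reproduced with the standard normal CDF (a Python sketch, not part of the original solution; `math.erf` gives Φ without any third-party library):

```python
from math import sqrt, erf

n1, n2, W1 = 11, 12, 136.5

mu = n1 * (n1 + n2 + 1) / 2                 # 132
sigma = sqrt(n1 * n2 * (n1 + n2 + 1) / 12)  # about 16.248
z = (W1 - mu) / sigma

def phi(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1 + erf(x / sqrt(2)))

p_two_tailed = 2 * (1 - phi(z))
print(round(z, 4), round(p_two_tailed, 4))  # 0.277 0.7818
```

The unrounded z gives 0.7818, which is the value in the Excel template for this exercise; the 0.7794 above comes from rounding z to 0.28 before the table lookup.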
Of course, you could also use the Excel template to do these calculations. It is shown below.
Making Decisions About Two Population Locations, Two Independent Samples of Non-Normal Quantitative Data or Ranked Data (WRST)
Sample 1 Size  11
Sample 2 Size  12
Are both sample sizes at least 10?  yes
Are the sample histograms similar in shape and spread?  yes
W1  136.5
W2  139.5
z-Score (based on W1)  0.27695585
One-Tailed p-Value  0.390907
Two-Tailed p-Value  0.781814
11. First, realize these are matched-pairs data. Prices are for the same book each year. (Remember to think about whether you have independent or matched-pairs samples, because the techniques for each are different.) Next check to see if the differences are normally distributed. One possible histogram of differences is shown below.
[Histogram: Book Price Comparison, frequency of (Book Price Last Year) - (Book Price This Year)]
The histogram is skewed to the right, but somewhat normal in shape. The results of the Data Analysis tool for the t-test are shown below.

t-Test: Paired Two Sample for Means
                              Book Price Last Year   Book Price This Year
Mean                          14.54                  12.426
Variance                      30.37956842            30.895162
Observations                  20                     20
Pearson Correlation           0.879011986
Hypothesized Mean Difference  0
df                            19
t Stat                        3.471780487
P(T<=t) one-tail              0.001276846
t Critical one-tail           1.729132792
P(T<=t) two-tail              0.002553692
t Critical two-tail           2.09302405
H0 : µD = 0 H1 : µD > 0 (The order of subtraction is (book price last year) – (book price this year). If book prices have decreased, this difference would be positive, on average.) α = 0.05 The p-value is 0.001 < 0.05. Reject H0. There is enough evidence to suggest that book prices are lower this year than last year. Note that this is not a random sample. These are books that are of interest to this particular consumer. We should be cautious about drawing a conclusion about all books, based on these data.
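The paired t statistic can be recovered from the summary output alone (a Python sketch, not part of the original solution; the variance of the differences follows from the two variances and the Pearson correlation the tool reports, so small rounding differences are expected):

```python
from math import sqrt

# Values from the Data Analysis output (as reported, already rounded)
xbar1, var1 = 14.54, 30.37956842    # book price last year
xbar2, var2 = 12.426, 30.895162     # book price this year
n, r = 20, 0.879011986              # number of pairs, Pearson correlation

# For paired data: s_d^2 = s1^2 + s2^2 - 2*r*s1*s2
s_d = sqrt(var1 + var2 - 2 * r * sqrt(var1 * var2))

t = (xbar1 - xbar2) / (s_d / sqrt(n))
print(round(t, 2))   # about 3.47, matching the reported t Stat
```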
12. We are provided with summary data, and so can proceed either manually or with the Excel template. We are told that the sample data are normally distributed, so the t-test of the difference in means is appropriate. We will refer to the population of the number of exercises required to master the topic, according to professors, as population 1, and the population of the number of exercises required, according to the students' experience, as population 2. We are asked if the professors have unrealistic expectations of the number of exercises that students need to master the topic. We interpret this to mean "unrealistically high". In this case, the alternative hypothesis will be that µ1 - µ2 > 0.
H0: µ1 - µ2 = 0
H1: µ1 - µ2 > 0
α = 0.01
x̄1 = 19.2, x̄2 = 12.3, s1 = 5.2, s2 = 3.6, n1 = 15, n2 = 20
We are told the sample data appear normally distributed. The completed Excel template is shown below.
Making Decisions About the Difference in Population Means with Two Independent Samples
Do the sample data appear to be normally distributed?  yes
Sample 1 Standard Deviation  5.2
Sample 2 Standard Deviation  3.6
Sample 1 Mean  19.2
Sample 2 Mean  12.3
Sample 1 Size  15
Sample 2 Size  20
Hypothetical Difference in Population Means  0
t-Score  4.40765
One-Tailed p-Value  0.0001
Two-Tailed p-Value  0.0002
The one-tailed p-value is 0.0001, which is less than 1%. Reject H0. There is sufficient evidence to infer that professors have unrealistically high expectations of the number of exercises that students need to do to master this topic.
13. The completed Excel template is shown below.
Confidence Interval Estimate for the Difference in Population Means
Do the sample data appear to be normally distributed?  yes
Sample 1 Standard Deviation  5.2
Sample 2 Standard Deviation  3.6
Sample 1 Mean  19.2
Sample 2 Mean  12.3
Sample 1 Size  15
Sample 2 Size  20
Confidence Level (decimal form)  0.99
Upper Confidence Limit  11.2948
Lower Confidence Limit  2.50523
We have 99% confidence that the interval (2.5, 11.3) contains the true overestimation of the number of exercises required to master this topic, compared to the actual experience of students. We would not particularly expect this interval to contain zero, since the hypothesis test in Exercise 12 concluded that professors have higher expectations about the number of exercises required to master a topic, compared with students. The 99% confidence interval is wider than the interval that directly corresponds to the hypothesis test in exercise 12 (the tail area there would be 1%; for a 99% confidence interval, there is only ½% in each tail). However, even the wider interval does not contain zero.
14. H0: µA - µB = 0 H1 : µA - µB ≠ 0 α = 0.05 x A = 862, x B = 731, sA = 362, sB = 223, nA = 31, nB = 25 We are told the sample data appear normally distributed. The completed Excel template is shown below.
Making Decisions About the Difference in Population Means with Two Independent Samples
Do the sample data appear to be normally distributed?  yes
Sample 1 Standard Deviation  362
Sample 2 Standard Deviation  223
Sample 1 Mean  862
Sample 2 Mean  731
Sample 1 Size  31
Sample 2 Size  25
Hypothetical Difference in Population Means  0
t-Score  1.66151
One-Tailed p-Value  0.05143
Two-Tailed p-Value  0.10287
The two-tailed p-value is 0.103 > α. Fail to reject H0. There is insufficient evidence to infer there is a difference in the number of pages produced by the two brands of cartridges, under these conditions.
15. The completed Excel template is shown below.
Confidence Interval Estimate for the Difference in Population Means
Do the sample data appear to be normally distributed?  yes
Sample 1 Standard Deviation  362
Sample 2 Standard Deviation  223
Sample 1 Mean  862
Sample 2 Mean  731
Sample 1 Size  31
Sample 2 Size  25
Confidence Level (decimal form)  0.9
Upper Confidence Limit  263.135
Lower Confidence Limit  -1.1352
We have 90% confidence that the interval (-1.1, 263.1) contains the difference in the number of pages produced by the two brands of printer cartridge, under these conditions.
16. We will refer to the population of wait times for ITM support as population 1, and the population of wait times for Dull support as population 2. H0 : µ1 - µ2 = 0 H1 : µ1 - µ2 ≠ 0 α = 0.05 x 1 = 8.5, x 2 = 6.5, s1 = 2.6, s2 = 1.9, n1 = 34, n2 = 36 We are told the sample data appear normally distributed. The completed Excel template is shown below.
Making Decisions About the Difference in Population Means with Two Independent Samples
Do the sample data appear to be normally distributed?  yes
Sample 1 Standard Deviation  2.6
Sample 2 Standard Deviation  1.9
Sample 1 Mean  8.5
Sample 2 Mean  6.5
Sample 1 Size  34
Sample 2 Size  36
Hypothetical Difference in Population Means  0
t-Score  3.65697
One-Tailed p-Value  0.00027
Two-Tailed p-Value  0.00054
The two-tailed p-value is 0.00054 < α. Reject H0. There is sufficient evidence to infer there is a difference in average wait times for support between the ITM and Dull computers.
17. The completed Excel template is shown below.
Confidence Interval Estimate for the Difference in Population Means
Do the sample data appear to be normally distributed?  yes
Sample 1 Standard Deviation  2.6
Sample 2 Standard Deviation  1.9
Sample 1 Mean  8.5
Sample 2 Mean  6.5
Sample 1 Size  34
Sample 2 Size  36
Confidence Level (decimal form)  0.95
Upper Confidence Limit  3.09397
Lower Confidence Limit  0.90603
We have 95% confidence that the interval (0.91 minutes, 3.09 minutes) contains the true extra average wait time for support for the ITM computers, compared with the Dull computers. This confidence interval corresponds directly to the two-tailed hypothesis test in Exercise 16. Since the null hypothesis of no difference was rejected there, we would not expect this confidence interval to contain zero (and it does not).
18. Call the minutes the advisor spent with her clients in January a year ago population 1, and the minutes spent with clients this January population 2. H0 : µ1 - µ2 = 0 H1 : µ1 - µ2 < 0 α = 0.03
First we must examine the sample data. Histograms indicate the data are approximately normal.

[Histogram: Minutes Financial Advisor Spent with Each Client Last January, frequency vs. minutes]
[Histogram: Minutes Financial Advisor Spent with Each Client This January, frequency vs. minutes]
The data are available in Excel, so it seems reasonable to use Excel to do the t-test.

t-Test: Two-Sample Assuming Unequal Variances
                              Minutes Spent with Each Client
                              Last January   This January
Mean                          50.63333333    59
Variance                      281.7574713    597.8823529
Observations                  30             35
Hypothesized Mean Difference  0
df                            60
t Stat                        -1.62607466
P(T<=t) one-tail              0.05458761
t Critical one-tail           1.917025767
P(T<=t) two-tail              0.109175221
t Critical two-tail           2.222923468
This is a one-tailed test. The p-value is 0.0546. This is not less than α = 0.03. Fail to reject H0. There is insufficient evidence to infer that the financial advisor spent more time with her clients this January, compared to last January. In this case, the p-value is on the low side (although not low enough). However, even if there was sufficient evidence to reject H0, this would not necessarily imply that the cause was the increasing complexity of investment products. Since the clients were different, it might simply be that one set of clients had more complex investment needs than the other.
19. The completed Excel template is shown below.
Confidence Interval Estimate for the Difference in Population Means
Do the sample data appear to be normally distributed?  yes
Sample 1 Standard Deviation  16.7856
Sample 2 Standard Deviation  24.4516
Sample 1 Mean  50.6333
Sample 2 Mean  59
Sample 1 Size  30
Sample 2 Size  35
Confidence Level (decimal form)  0.97
Upper Confidence Limit  3.07098
Lower Confidence Limit  -19.804

With 97% confidence, we estimate that the interval (-19.8 minutes, 3.07 minutes) contains the true difference between the average amount of time the advisor spent with her clients in January a year ago, compared with this January.
20. Call the amount of time spent by sales reps in a two week period with the old software population 1, and the amount of time spent by sales reps with the new software population 2. H0 : µ1 - µ2 = 0 H1 : µ1 - µ2 > 0 α = 0.04 First we must examine the sample data. Histograms indicate the data are approximately normal, although the sample data for the old software are skewed to the right. Also, sample sizes, at 30 and 35, are fairly large.
[Histogram: Minutes Spent by Sales Reps On Computer, Old Software, frequency vs. minutes in a two-week period]
[Histogram: Minutes Spent by Sales Reps On Computer, New Software, frequency vs. minutes in a two-week period]
The Excel output for the t-test is shown below. t-‐Test: Two-‐Sample Assuming Unequal Variances
                              Old Software   New Software
Mean                          799.8          608
Variance                      172199.5       68901.88
Observations                  30             35
Hypothesized Mean Difference  0
df                            48
t Stat                        2.184543
P(T<=t) one-tail              0.016919
t Critical one-tail           1.788547
P(T<=t) two-tail              0.033838
t Critical two-tail           2.111073
This is a one-tailed test, so the p-value is 0.017 < α = 0.04. Reject H0. There is sufficient evidence to infer that the amount of time spent by sales reps over a two-week period with the old software is more than the amount of time spent by sales reps with the new software. The new software may be the cause of the difference, or it might simply be that the work of the two sets of sales reps was different over the two-week period. If there is much week-to-week variability in the computer work done by sales reps, the test might be conducted over a longer period. As well, it would be interesting to compare the times spent by the 25 sales reps who used the old software for this comparison, with their times when they adopt the new software (a matched pairs comparison).
21. The completed Excel template is shown below.
Confidence Interval Estimate for the Difference in Population Means
Do the sample data appear to be normally distributed?  yes
Sample 1 Standard Deviation  414.969
Sample 2 Standard Deviation  262.492
Sample 1 Mean  799.8
Sample 2 Mean  608
Sample 1 Size  30
Sample 2 Size  35
Confidence Level (decimal form)  0.96
Upper Confidence Limit  377.26
Lower Confidence Limit  6.34049

At a 96% confidence level, it is estimated that the interval (6.3 minutes, 377.3 minutes) contains the reduction in the amount of time that sales reps would spend over a two-week period, if the new software was adopted.
22. The data are ranked, so the Wilcoxon Rank Sum Test will be used to make the comparisons. H0: there is no difference in the locations of the populations of ratings of high-speed Internet service for the cable TV company and the telephone company H1: there is a difference in the locations of the populations of ratings of high-speed Internet service for the cable TV company and the telephone company α = 0.025 Before we use the Wilcoxon Rank Sum Test, we must examine the data to see we can reasonably assume that the populations are similar in shape and spread. Two possible bar graphs of the data are shown below.
[Bar graph: Ratings of Internet Service Provided by the Telephone Company, frequency by rating, 1 = Very Satisfied, 5 = Very Dissatisfied]
[Bar graph: Ratings of Internet Service Provided by the Cable TV Company, frequency by rating, 1 = Very Satisfied, 5 = Very Dissatisfied]

The distributions appear similar in shape and spread.
Since the data are available in an Excel file, it seems appropriate to do the calculations in Excel. The output of the Wilcoxon Rank Sum Test Calculations is shown below.

Wilcoxon Rank Sum Test Calculations
sample 1 size  35
sample 2 size  35
W1  1349
W2  1136
The relevant template is shown below.
Making Decisions About Two Population Locations, Two Independent Samples of Non-Normal Quantitative Data or Ranked Data (WRST)
Sample 1 Size  35
Sample 2 Size  35
Are both sample sizes at least 10?  yes
Are the sample histograms similar in shape and spread?  yes
W1  1349
W2  1136
z-Score (based on W1)  1.25095882
One-Tailed p-Value  0.10547475
Two-Tailed p-Value  0.2109495

The two-tailed p-value is 0.21095. Fail to reject H0. There is insufficient evidence to infer there is a difference in the locations of the populations of ratings of Internet service by the cable TV company and the telephone company.
23. a. The data are ranked, so we consider the Wilcoxon Rank Sum Test. H0: there is no difference in the locations of the populations of ratings of the old instructions and the new instructions for lawnmower assembly H1: the location of the population of ratings of the old instructions is to the right of the population of ratings of the new instructions for lawnmower assembly (a highernumbered rating means greater difficulty) α = 0.05 The requirement is that the distributions are similar in shape and spread. Two graphs of the data are shown below.
[Bar graph: Ratings for Old Lawnmower Assembly Instructions, frequency by rating, 1 = Very Easy to Read and Follow, 5 = Very Difficult to Read and Follow]
[Bar graph: Ratings for New Lawnmower Assembly Instructions, frequency by rating, 1 = Very Easy to Read and Follow, 5 = Very Difficult to Read and Follow]
The distributions are similar in spread, but not in shape. Any conclusion we make from the Wilcoxon Rank Sum Test will be weaker, as a result. The ranking could of course be completed manually. The output from the Wilcoxon Rank Sum Test Calculations is shown below.
Wilcoxon Rank Sum Test Calculations
sample 1 size  37
sample 2 size  42
W1  1697
W2  1463
Since both sample sizes are more than 10, we will use the Excel template to estimate p-value.
Making Decisions About Two Population Locations, Two Independent Samples of Non-Normal Quantitative Data or Ranked Data (WRST)
Sample 1 Size  37
Sample 2 Size  42
Are both sample sizes at least 10?  yes
Are the sample histograms similar in shape and spread?  no
W1  1697
W2  1463
z-Score (based on W1)  2.13196395
One-Tailed p-Value  0.01650491
Two-Tailed p-Value  0.03300981
p-value = 0.016 Reject H0. There is sufficient evidence to infer that population distributions of ratings for the old and new instructions are different. We have seen that there is a difference in shape, but given the marked differences in the frequencies of the "1" and "5" ratings, we are probably safe to conclude that customers find the new instructions easier to read and follow.
b.
These data are quantitative, and the samples are independent. We must check for normality before proceeding. Two possible histograms for the sample data are shown below.
[Histogram: Lawnmower Assembly Times with Old Instructions, frequency vs. minutes]
[Histogram: Lawnmower Assembly Times with New Instructions, frequency vs. minutes]
Both distributions seem approximately normal. The distribution of times for the new instructions is somewhat skewed to the right. However, sample sizes (37 and 42) are fairly large, and we will proceed with the t-test.
H0: µ1 - µ2 = 0
H1: µ1 - µ2 > 0
α = 0.04
The Excel output for this data set is shown below. t-‐Test: Two-‐Sample Assuming Unequal Variances
                              Assembly Times (Minutes)
                              With Old Instructions   With New Instructions
Mean                          64.43243243             47.5952381
Variance                      379.9189189             216.4907085
Observations                  37                      42
Hypothesized Mean Difference  0
df                            67
t Stat                        4.287366909
P(T<=t) one-tail              2.96486E-05
t Critical one-tail           1.77764645
P(T<=t) two-tail              5.92972E-05
t Critical two-tail           2.094505758
The p-value for the one-tailed test is 0.0000296 < 0.04. Reject H0. There is sufficient evidence to infer the assembly times for consumers using the new instructions are lower than for consumers using the old instructions.
Instructor’s Solutions Manual - Chapter 11
Chapter 11 Solutions Develop Your Skills 11.1 1. These data are collected on a random sample of days. They should be independent, unless the locations are close enough to each other that the foot traffic at each would be affected by the same factors. We will assume this is not the case. Histograms show approximate normality.
[Histogram: Daily Foot Traffic at Location 1, number of days vs. number of people]
[Histogram: Daily Foot Traffic at Location 2, number of days vs. number of people]
[Histogram: Daily Foot Traffic at Location 3, number of days vs. number of people]
The histogram for foot traffic at location 1 shows some right-skewness, but sample sizes are reasonable, and close to the same, so we will assume the population data are normally distributed. The largest variance is 478.7 (for location 2), and the smallest is 257.2 (location 1). The largest variance is less than twice as large as the smallest. So, following our rule, we will assume the population variances are approximately equal. Therefore, these data meet the required conditions for one-way ANOVA.
2.
Because we don't know the details of how the cashiers made their sample selection, we cannot know if the sample was truly random or independent. We will assume that the sample data were properly collected. Histograms suggest normality.
[Histogram: Winery Purchases for Customers Under 30 Years of Age, number of purchases vs. value of purchase]
[Histogram: Winery Purchases for Customers Aged 30-50, number of purchases vs. value of purchase]
[Histogram: Winery Purchases for Customers Over 50 Years of Age, number of purchases vs. value of purchase]
The largest variance is 652.9, and the smallest is 555.1, so clearly the sample variances are fairly close in value. We will assume that the population variances are approximately equal. These data appear to meet the requirements for one-way ANOVA.

3.
We will presume that the college collected the sample data appropriately, so the data are independent and truly random. The histograms suggest normality.
[Histogram: Annual Salaries of Marketing Graduates, number of graduates vs. annual salary]
[Histogram: Annual Salaries of Accounting Graduates, number of graduates vs. annual salary]
[Histogram: Annual Salaries of Human Resources Graduates, number of graduates vs. annual salary]
[Histogram: Annual Salaries of General Business Graduates, number of graduates vs. annual salary]
The largest variance is 159,729,974, and the smallest is 70,826,421. The ratio of the largest to the smallest is about 2.3, which meets the requirement (less than four). These data appear to meet the requirements for one-way ANOVA.
4.
It appears the data are randomly selected, and independent. The data sets are too small for histograms, but stem-and-leaf displays suggest normality.

Route 1
3 | 3 6
4 | 0 5 6 8
5 | 1 4 7
6 | 0

Route 2
2 | 2 8 8
3 | 2 3 5 5 8
4 | 6 9

Route 3
3 | 1 6
4 | 3 6 9
5 | 3 5 6 7
6 | 1
The largest variance is 94, the smallest is 67, for a ratio of largest-to-smallest of about 1.4. This is within the accepted range, so we will assume the population variances are approximately equal. These data appear to meet the requirements for one-way ANOVA.

5.
The histograms appear approximately normal. We have to be a bit cautious about assuming these are random samples. For example, one class may be mostly Accounting students, one may be mostly Marketing students, etc. The students who have selected these programs may have different levels of interest and aptitudes for statistics. We will assume that the classes are approximately randomly selected, in the absence of other information, but should note the caution. The largest variance is not much larger than the smallest variance, so we will assume the population variances are approximately equal.
Develop Your Skills 11.2

6. H0: µ1 = µ2 = µ3
H1: At least one µ differs from the others.
α = 0.05
nT = 85, n1 = 27, n2 = 30, n3 = 28, k = 3
x̄1 = 50.5556, x̄2 = 56.6, x̄3 = 74.3214
s1² = 257.1795, s2² = 478.7310, s3² = 333.5595
SSbetween = 8475.2497, SSwithin = 29,575.9738
We have already checked for normality and equality of variances.
F = MSbetween / MSwithin
  = [SSbetween/(k − 1)] / [SSwithin/(nT − k)]
  = (8475.2497/2) / (29575.9738/82)
  = 4237.6249 / 360.6826
  = 11.749

The F-distribution has 2, 82 degrees of freedom. The closest we can come in the table is 2, 80. We see that the p-value is < 1% (Excel provides a p-value of 0.00003). Reject H0. There is sufficient evidence to conclude that at least one of the locations has a different average number of daily passersby than the others. The Excel output for this data set is shown below.

Anova: Single Factor

SUMMARY
Groups      Count  Sum   Average  Variance
Location 1  27     1365  50.5556  257.1794872
Location 2  30     1698  56.6000  478.7310
Location 3  28     2081  74.3214  333.5595238

ANOVA
Source of Variation  SS          df  MS         F       P-value
Between Groups       8475.2497   2   4237.6249  11.749  3.26E-05
Within Groups        29575.9738  82  360.6826
Total                38051.2235  84
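The F statistic can be rebuilt from the group summaries alone (a Python sketch, not part of the original solution; it reproduces the SS values in the Excel output up to rounding of the reported means and variances):

```python
# Group sizes, means, and variances from Exercise 6
ns        = [27, 30, 28]
means     = [50.5556, 56.6000, 74.3214]
variances = [257.1795, 478.7310, 333.5595]

nT = sum(ns)
k = len(ns)
grand_mean = sum(n * m for n, m in zip(ns, means)) / nT

# Between-groups SS: weighted squared deviations of group means
ss_between = sum(n * (m - grand_mean) ** 2 for n, m in zip(ns, means))
# Within-groups SS: pooled (n - 1) * s^2 across groups
ss_within = sum((n - 1) * v for n, v in zip(ns, variances))

F = (ss_between / (k - 1)) / (ss_within / (nT - k))
print(round(F, 2))   # about 11.75
```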
7.
H0: µ1 = µ2 = µ3
H1: At least one µ differs from the others.
α = 0.05
nT = 150, n1 = 50, n2 = 50, n3 = 50, k = 3
x̄1 = 77.5684, x̄2 = 119.6708, x̄3 = 132.4674
s1² = 652.9145, s2² = 555.0899, s3² = 625.7846
SSbetween = 82504.4210, SSwithin = 89855.6606
We have already checked for normality and equality of variances. F = 67.5 The F-distribution has 2, 147 degrees of freedom. Excel provides a p-value of approximately zero. Reject H0. There is sufficient evidence to conclude that customers in different age groups make different average purchases. 8.
H0: µ1 = µ2 = µ3 = µ4
H1: At least one µ differs from the others.
α = 0.025
nT = 80, n1 = 20, n2 = 20, n3 = 20, n4 = 20, k = 4
x̄1 = 51,395, x̄2 = 71,170, x̄3 = 56,100, x̄4 = 53,885
s1² = 159,729,973.68, s2² = 70,826,421.05, s3² = 116,576,842.11, s4² = 76,859,236.84
SSbetween = 4,750,850,500, SSwithin = 8,055,857,000
We have already checked for normality and equality of variances. F = 14.9 The F-distribution has 3, 76 degrees of freedom. Excel provides a p-value of approximately zero. Reject H0. There is sufficient evidence to conclude that at least one of the program streams had an average salary for graduates that differs from that of the other program streams.
Anova: Single Factor

SUMMARY
Groups            Count  Sum      Average  Variance
Marketing         20     1027900  51395    159729973.68
Accounting        20     1423400  71170    70826421.05
Human Resources   20     1122000  56100    116576842.11
General Business  20     1077700  53885    76859236.84

ANOVA
Source of Variation  SS           df  MS        F            P-value
Between Groups       4750850500   3   1.58E+09  14.94004664  9.77E-08
Within Groups        8055857000   76  1.06E+08
Total                12806707500  79

9.
H0: µ1 = µ2 = µ3
H1: At least one µ differs from the others.
α = 0.05
nT = 30, n1 = 10, n2 = 10, n3 = 10, k = 3
x̄1 = 47, x̄2 = 34.6, x̄3 = 48.7
s1² = 78.4444, s2² = 67.1556, s3² = 94.0111
SSbetween = 1184.8667, SSwithin = 2156.5
We have already checked for normality and equality of variances. F = 7.4 The F-distribution has 2, 27 degrees of freedom. Excel provides a p-value of 0.0027. Reject H0. There is sufficient evidence to conclude that the average commuting time for at least one of the routes is different from the others. The Excel output is shown below.
Anova: Single Factor

SUMMARY
Groups   Count  Sum  Average  Variance
Route 1  10     470  47       78.44444
Route 2  10     346  34.6     67.15556
Route 3  10     487  48.7     94.01111

ANOVA
Source of Variation  SS          df  MS        F         P-value
Between Groups       1184.86667  2   592.4333  7.417436  0.002708
Within Groups        2156.5      27  79.87037
Total                3341.36667  29
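An Excel "Anova: Single Factor" table is internally consistent: SS_total = SS_between + SS_within, MS = SS/df, and F = MS_between/MS_within. A short Python sketch (not part of the text's Excel workflow) checks this for the route table's values:

```python
# Verify the arithmetic of the one-way ANOVA table for the three routes.
ss_between, df_between = 1184.86667, 2
ss_within, df_within = 2156.5, 27

ss_total = ss_between + ss_within          # should equal the Total row
ms_between = ss_between / df_between       # MS = SS / df
ms_within = ss_within / df_within
F = ms_between / ms_within                 # about 7.417436, as in the table
print(round(ss_total, 5), round(F, 6))
```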
10.
H0: µ1 = µ2 = µ3
H1: At least one µ differs from the others.
α = 0.05
nT = 135, n1 = 45, n2 = 45, n3 = 45, k = 3
x̄1 = 70.1111, x̄2 = 56.6889, x̄3 = 54.0667
s1² = 212.1010, s2² = 226.5828, s3² = 218.0182
SSbetween = 6666.8444, SSwithin = 28894.8889
We have already checked for normality and equality of variances.
F = 15.2
The F-distribution has 2, 132 degrees of freedom. Excel provides a p-value of approximately zero. Reject H0. There is sufficient evidence to conclude that differences in the use of the online software are associated with differences in final grades. We should be cautious about interpreting the results: although there is evidence of a difference in the average grades, we cannot necessarily attribute the difference to the use of the online software. There are many potential confounding factors, that is, other factors which could have an effect on the final grades.
Develop Your Skills 11.3

11. Completed Excel templates are shown below.

For locations 1 and 3:

Tukey-Kramer Confidence Interval
Was the null hypothesis rejected in the ANOVA test?   yes
x-bar i                  50.5556
x-bar j                  74.3214
ni                       27
nj                       28
q (from Appendix 7)      3.4
MSwithin                 360.682607
Upper Confidence Limit   -11.4505171
Lower Confidence Limit   -36.0812289

For locations 2 and 3:

Tukey-Kramer Confidence Interval
Was the null hypothesis rejected in the ANOVA test?   yes
x-bar i                  56.6000
x-bar j                  74.3214
ni                       30
nj                       28
q (from Appendix 7)      3.4
MSwithin                 360.682607
Upper Confidence Limit   -5.72364915
Lower Confidence Limit   -29.719208

For locations 1 and 2:

Tukey-Kramer Confidence Interval
Was the null hypothesis rejected in the ANOVA test?   yes
x-bar i                  50.5556
x-bar j                  56.6000
ni                       27
nj                       30
q (from Appendix 7)      3.4
MSwithin                 360.682607
Upper Confidence Limit   6.06771106
Lower Confidence Limit   -18.1565999
The first two confidence intervals do not contain zero, so it appears that the average number of people passing by location 3 is greater than at the other two locations.

12. Completed Excel templates are shown below (to save space, the row checking for rejection of the null hypothesis in ANOVA is not shown).

For under 30 and over 50:

Tukey-Kramer Confidence Interval
x-bar i                  77.568
x-bar j                  132.467
ni                       50
nj                       50
q (from Appendix 7)      3.36
MSwithin                 611.2629973
Upper Confidence Limit   -43.15088123
Lower Confidence Limit   -66.64711877
For under 30 and 30-50:

Tukey-Kramer Confidence Interval
x-bar i                  77.568
x-bar j                  119.671
ni                       50
nj                       50
q (from Appendix 7)      3.36
MSwithin                 611.2629973
Upper Confidence Limit   -30.3542812
Lower Confidence Limit   -53.8505188

For 30-50 and over 50:

Tukey-Kramer Confidence Interval
x-bar i                  119.671
x-bar j                  132.467
ni                       50
nj                       50
q (from Appendix 7)      3.36
MSwithin                 611.2629973
Upper Confidence Limit   -1.04848123
Lower Confidence Limit   -24.5447188
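The templates above all apply the Tukey-Kramer formula, (x̄i - x̄j) ± q·sqrt((MSwithin/2)(1/ni + 1/nj)). A hedged Python sketch (the function name is ours, not from the text) reproduces the under-30 versus over-50 interval:

```python
import math

# Tukey-Kramer confidence interval; allows unequal group sizes ni, nj.
def tukey_kramer(mean_i, mean_j, n_i, n_j, q, ms_within):
    half = q * math.sqrt((ms_within / 2) * (1 / n_i + 1 / n_j))
    diff = mean_i - mean_j
    return (diff - half, diff + half)

# Values from the under-30 vs over-50 template: q = 3.36 from Appendix 7.
lo, hi = tukey_kramer(77.568, 132.467, 50, 50, 3.36, 611.2629973)
print(round(lo, 2), round(hi, 2))  # about -66.65 and -43.15
```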
None of these confidence intervals contains zero, so all three pairwise differences appear significant. The highest average purchase is clearly made by customers over 50.

13. Completed Excel templates are shown below (to save space, the row checking for rejection of the null hypothesis in ANOVA is not shown).

Marketing and Accounting:

Tukey-Kramer Confidence Interval
x-bar i                  51395.000
x-bar j                  71170.000
ni                       20
nj                       20
q (from Appendix 7)      3.74
MSwithin                 105998118.4210530
Upper Confidence Limit   -11164.94982
Lower Confidence Limit   -28385.05018
Accounting and General:

Tukey-Kramer Confidence Interval
x-bar i                  71170.000
x-bar j                  53885.000
ni                       20
nj                       20
q (from Appendix 7)      3.74
MSwithin                 105998118.4210530
Upper Confidence Limit   25895.05018
Lower Confidence Limit   8674.949822

Accounting and Human Resources:

Tukey-Kramer Confidence Interval
x-bar i                  71170.000
x-bar j                  56100.000
ni                       20
nj                       20
q (from Appendix 7)      3.74
MSwithin                 105998118.4210530
Upper Confidence Limit   23680.05018
Lower Confidence Limit   6459.949822

Marketing and Human Resources:

Tukey-Kramer Confidence Interval
x-bar i                  51395.000
x-bar j                  56100.000
ni                       20
nj                       20
q (from Appendix 7)      3.74
MSwithin                 105998118.4210530
Upper Confidence Limit   3905.050178
Lower Confidence Limit   -13315.05018
At this point, no further comparisons are necessary. Since this interval contains zero, there does not appear to be a significant difference between the average salaries of Marketing graduates and Human Resources graduates. The differences between the sample means for all other pairs are smaller than for this pair, and so we know there will not be a significant difference for the other pairs.

To summarize, we have 95% confidence that the interval:
• (-$28,385.05, -$11,164.95) contains the average difference in the salaries of Marketing graduates, compared to Accounting graduates (in other words, the average salary of Accounting graduates is likely at least $11,164.95 higher)
• ($8,674.95, $25,895.05) contains the average difference in the salaries of Accounting graduates, compared to General Business graduates
• ($6,459.95, $23,680.05) contains the average difference in the salaries of Accounting graduates, compared to Human Resources graduates.
The differences between the average salaries of Human Resources, General Business, and Marketing graduates are not significant.

14. Because of the balanced design, these calculations simplify to:

(x̄i - x̄j) ± q √(MSwithin / n)
= (x̄i - x̄j) ± 3.49 √(79.8703704 / 10)
= (x̄i - x̄j) ± 9.86321
For route 2 and route 3:
(34.6 - 48.7) ± 9.86321 = -14.1 ± 9.86321 → (-23.96, -4.24)

For route 1 and route 2:
(47 - 34.6) ± 9.86321 = 12.4 ± 9.86321 → (2.54, 22.26)

For route 1 and route 3:
(47 - 48.7) ± 9.86321 = -1.7 ± 9.86321 → (-11.56, 8.16)
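The balanced-design shortcut above can be sketched in a few lines of Python (not used in the text, which works from Excel templates; the half-width is the same for every pair because all groups have n = 10):

```python
import math

# Balanced-design Tukey interval: (x-bar i - x-bar j) ± q * sqrt(MSwithin / n),
# with q = 3.49 read from Appendix 7 for this problem.
ms_within = 79.8703704
n = 10
q = 3.49

half_width = q * math.sqrt(ms_within / n)  # about 9.86321

def tukey_interval(mean_i, mean_j):
    diff = mean_i - mean_j
    return (round(diff - half_width, 2), round(diff + half_width, 2))

print(tukey_interval(34.6, 48.7))  # route 2 vs 3: (-23.96, -4.24)
print(tukey_interval(47.0, 34.6))  # route 1 vs 2: (2.54, 22.26)
```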
Route 2 would be the recommended route.

15. We have to be careful NOT to answer this question merely by inspection! First we recall that the F-test for ANOVA indicated a rejection of the null hypothesis. We have sample evidence that the population means are not all the same. The completed Excel templates are shown below.

For assigned quizzes and sample tests only:

Tukey-Kramer Confidence Interval
Was the null hypothesis rejected in the ANOVA test?   yes
x-bar i                  70.1111
x-bar j                  54.0667
ni                       45
nj                       45
q (from Appendix 7)      3.36
MSwithin                 218.900673
Upper Confidence Limit   23.455099
Lower Confidence Limit   8.63378989

We have 95% confidence that the interval (8.6, 23.5) contains the amount by which the average mark for all those who used the online software for assigned quizzes exceeds the average mark for all those who used sample tests only. Thus it appears that the average mark is at least 8.6 percentage points higher for those who use the online software for assigned quizzes.
For assigned quizzes for marks, and quizzes for no marks:

Tukey-Kramer Confidence Interval
Was the null hypothesis rejected in the ANOVA test?   yes
x-bar i                  70.1111
x-bar j                  56.6889
ni                       45
nj                       45
q (from Appendix 7)      3.36
MSwithin                 218.900673
Upper Confidence Limit   20.8328768
Lower Confidence Limit   6.01156767

Once again, it appears that the average marks are higher when the online software is used for assigned quizzes for marks, compared with quizzes for no marks. We have 95% confidence that the interval (6.0, 20.8) contains the amount by which the average marks are higher when the online software is used for assigned quizzes for marks.

We cannot conclude that there is a difference in the average marks when the online software is used for quizzes (no marks) or sample tests only. The confidence interval shown below contains zero.

Tukey-Kramer Confidence Interval
Was the null hypothesis rejected in the ANOVA test?   yes
x-bar i                  56.6889
x-bar j                  54.0667
ni                       45
nj                       45
q (from Appendix 7)      3.36
MSwithin                 218.900673
Upper Confidence Limit   10.0328768
Lower Confidence Limit   -4.78843233
We have evidence that assigning quizzes for marks results in the best average marks for students. However, as we cautioned before, we cannot be certain of the cause-and-effect relationship here, because there are many potentially confounding variables.

Chapter Review Exercises

1. The histograms appear approximately normal, although there is some skewness in each one. However, with the large sample sizes, it is not unreasonable to assume the normality requirements are met.

2.
The largest variance is 590.65, and the smallest is 370.02. The ratio of the largest to the smallest is not above 4, so it is reasonable to assume that population variances are approximately equal.
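The rule of thumb used throughout this chapter (population variances treated as approximately equal when the largest sample variance is less than four times the smallest) is easy to check mechanically. A small Python sketch, using the three class variances from the exercise 3 output (Python is not part of the text's Excel workflow):

```python
# Equal-variance rule of thumb: ratio of largest to smallest variance < 4.
variances = [370.0179171, 590.6535274, 415.5823068]
ratio = max(variances) / min(variances)
print(round(ratio, 2), ratio < 4)  # about 1.6, True
```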
3. The missing values are shown below in bold type.

SUMMARY
Groups    Count  Sum   Average      Variance
Class #1  95     5840  61.47368421  370.0179171
Class #2  95     5088  53.55789474  590.6535274
Class #3  95     6075  63.94736842  415.5823068

ANOVA
Source of Variation  SS           df   MS           F
Between Groups       5596.133333  2    2798.066667  6.099311258
Within Groups        129367.8526  282  458.7512505
Total                134963.986   284

4. The appropriate F-distribution has 2, 282 degrees of freedom. We refer to the area in the F table for 2, 120 degrees of freedom and see that an F-score of 6.1 has a p-value less than 0.010. Excel provides a more accurate value of 0.0026.
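Excel's more exact p-value can also be reproduced with scipy's F-distribution (scipy is an assumption here; the text itself uses Excel's output and the printed F table):

```python
from scipy.stats import f

# Upper-tail area of the F-distribution for F = 6.0993 on 2, 282 df.
p = f.sf(6.099311258, dfn=2, dfd=282)
print(round(p, 4))  # close to Excel's 0.0026
```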
5. Because of the balanced design, these calculations simplify to:

(x̄i - x̄j) ± q √(MSwithin / n)
= (x̄i - x̄j) ± 3.31 √(458.7512505 / 95)
= (x̄i - x̄j) ± 7.273691

For Class 2 and Class 3:
(53.5579 - 63.9474) ± 7.273691 → (-17.7, -3.1)
We have 95% confidence that the interval (-17.7, -3.1) contains the difference between the average marks of Class 2 and Class 3. In other words, it appears that the average mark of those with the Class 3 professor is at least 3 percentage points higher than the average mark for those with the Class 2 professor.

For Class 1 and Class 2:
(61.4737 - 53.5579) ± 7.273691 → (0.6, 15.2)
We have 95% confidence that the interval (0.6, 15.2) contains the difference between the average marks of Class 1 and Class 2. In other words, it appears that the average mark of those with the Class 1 professor is at least 0.6 percentage points higher than the average mark for those with the Class 2 professor.

For Class 1 and Class 3:
(61.4737 - 63.9474) ± 7.273691 → (-9.7, 4.8)
In this case, the interval contains zero, and so there does not appear to be a significant difference between the average marks of those with the Class 1 professor and those with the Class 3 professor.

From these comparisons, it appears that the average marks are lower for the Class 2 professor's classes, and so this class should be avoided. There is no significant difference between the average marks for Class 1 and Class 3. The choice should then be: any professor but the one who led Class 2.
However, this is not a valid method of choosing classes, because there could be many explanations for why the Class 2 marks were significantly lower. It could have to do with the teacher's expertise and evaluation methods. But it could also have arisen because of other factors: the students in Class 2 might have been less well-prepared, they may have worked more or had family responsibilities that prevented them from studying, the class times might have been inconvenient, etc.

6.
The conditions for ANOVA are not met, given the information in these three samples. The distribution of monthly balances for Mastercard owners is quite skewed to the left. The distribution of monthly balances for American Express owners is quite skewed to the right. As well, the variance of the American Express data is more than four times as large as the variance for the Mastercard data. It would not be appropriate to use ANOVA techniques in this case. The Kruskal-Wallis test could be used to compare these samples and draw conclusions about the populations (this technique is not covered in this text).
7.
The requirement for equal variances is met. The largest variance is 14.757, which is only 2.3 times as large as the smallest variance, which is 6.314. The missing values are shown below, in bold type.

SUMMARY
Groups      Count  Sum  Average   Variance
Employee 1  35     404  11.54286  6.314286
Employee 2  37     462  12.48649  14.75676
Employee 3  32     357  11.15625  10.32964
Employee 4  42     377  8.97619   13.536

ANOVA
Source of Variation  SS        df   MS        F
Between Groups       264.6295  3    88.20984  7.726613
Within Groups        1621.124  142  11.41637
Total                1885.753  145

8. The F-distribution will have 3, 142 degrees of freedom. The closest we can come in the table is 3, 120. The closest entry in the table is 3.95, and so we know that the p-value is < 0.01. At the 5% level of significance, the data do suggest that there are differences in the average number of minutes each employee spends with a customer before making a sale.
9. The completed Excel templates are shown below.

Employee 4 and Employee 2:

Tukey-Kramer Confidence Interval
Was the null hypothesis rejected in the ANOVA test?   yes
x-bar i                  8.97619048
x-bar j                  12.4864865
ni                       42
nj                       37
q (from Appendix 7)      3.68
MSwithin                 11.4163655
Upper Confidence Limit   -1.52792567
Lower Confidence Limit   -5.49266635

We have 95% confidence that the interval (-5.5, -1.5) contains the number of minutes by which the average time spent with customers before making a sale for Employee 4 differs from the average time spent by Employee 2. In other words, we expect the average time spent by Employee 4 is at least 1.5 minutes less than Employee 2.

Employee 4 and Employee 1:

Tukey-Kramer Confidence Interval
Was the null hypothesis rejected in the ANOVA test?   yes
x-bar i                  8.97619048
x-bar j                  11.5428571
ni                       42
nj                       35
q (from Appendix 7)      3.68
MSwithin                 11.4163655
Upper Confidence Limit   -0.55440966
Lower Confidence Limit   -4.57892367
We have 95% confidence that the interval (-4.5, -0.5) contains the number of minutes by which the average time spent with customers before making a sale for Employee 4 differs from the average time spent by Employee 1. In other words, we expect the average time spent by Employee 4 is at least 0.5 minutes less than Employee 1.

Employee 4 and Employee 3:

Tukey-Kramer Confidence Interval
Was the null hypothesis rejected in the ANOVA test?   yes
x-bar i                  8.97619048
x-bar j                  11.15625
ni                       42
nj                       32
q (from Appendix 7)      3.68
MSwithin                 11.4163655
Upper Confidence Limit   -0.11699421
Lower Confidence Limit   -4.24312484

We have 95% confidence that the interval (-4.2, -0.1) contains the number of minutes by which the average time spent with customers before making a sale for Employee 4 differs from the average time spent by Employee 3. In other words, we expect the average time spent by Employee 4 is at least 0.1 minutes less than Employee 3.
Employee 2 and Employee 3:

Tukey-Kramer Confidence Interval
Was the null hypothesis rejected in the ANOVA test?   yes
x-bar i                  12.4864865
x-bar j                  11.15625
ni                       37
nj                       32
q (from Appendix 7)      3.68
MSwithin                 11.4163655
Upper Confidence Limit   3.45272548
Lower Confidence Limit   -0.79225251

Since this interval contains zero, we conclude there is no significant difference between the average number of minutes Employees 2 and 3 spend with customers before making a sale. At this point, we can conclude that there are no significant differences between the average number of minutes Employees 1, 2 and 3 spend with customers before making a sale (the differences in the sample means are all less than the difference for Employees 2 and 3). This means that the average amount of time spent by Employee 4 is less than the average amount of time spent by the other employees.
10. Without further information, we cannot comment on whether the data are independent random samples. In practice, we should never take this on faith. We will assume this condition is met, with a caution that if it isn't, the results may not be reliable. Histograms of the sample data reassure us that the population data are probably normally distributed.
[Histograms: Number of Factory Accidents for Training Methods #1, #2, and #3 (frequency vs. number of accidents)]
The largest variance is 16.5, which is less than twice as large as the smallest variance of 8.3, so we will assume the population variances are approximately equal. It appears that the conditions for one-way ANOVA are met.

11. The Excel output is shown below.

Anova: Single Factor

SUMMARY
Groups                                   Count  Sum  Average   Variance
Number of Accidents, Training Method #1  30     281  9.366667  8.309195
Number of Accidents, Training Method #2  30     331  11.03333  9.757471
Number of Accidents, Training Method #3  30     362  12.06667  16.47816

ANOVA
Source of Variation  SS        df  MS        F         P-value
Between Groups       111.3556  2   55.67778  4.835263  0.010205
Within Groups        1001.8    87  11.51494
Total                1113.156  89
H0: µ1 = µ2 = µ3
H1: At least one µ differs from the others.
α = 0.025
nT = 90, n1 = 30, n2 = 30, n3 = 30, k = 3
x̄1 = 9.3667, x̄2 = 11.0333, x̄3 = 12.0667
s1² = 8.3092, s2² = 9.7575, s3² = 16.4782
SSbetween = 111.3556, SSwithin = 1001.8
We have already checked for normality and equality of variances.
F = 4.835
Excel provides a p-value of 0.010205. Reject H0. There is sufficient evidence to conclude that the average number of factory accidents differs according to the training method. However, we cannot be certain that it is the training method that caused these differences. There may be other factors involved.

12. Comparing training method #1 and #3:

Tukey-Kramer Confidence Interval
Was the null hypothesis rejected in the ANOVA test?   yes
x-bar i                  9.366667
x-bar j                  12.06667
ni                       30
nj                       30
q (from Appendix 7)      3.4
MSwithin                 11.51494
Upper Confidence Limit   -0.59356
Lower Confidence Limit   -4.80644

We have 95% confidence that the interval (-4.8, -0.6) contains the amount by which the average number of factory accidents for training method #1 differs from the average number of factory accidents for training method #3. In other words, it appears that training method #1 is associated with at least 0.6 fewer accidents, on average.
Comparing training method #2 and #3:

Tukey-Kramer Confidence Interval
Was the null hypothesis rejected in the ANOVA test?   yes
x-bar i                  11.033
x-bar j                  12.067
ni                       30
nj                       30
q (from Appendix 7)      3.4
MSwithin                 11.515
Upper Confidence Limit   1.0731
Lower Confidence Limit   -3.1398
Since this confidence interval contains zero, there is not a significant difference in the average number of factory accidents associated with training methods #2 and #3.

Comparing training method #1 and #2:

Tukey-Kramer Confidence Interval
Was the null hypothesis rejected in the ANOVA test?   yes
x-bar i                  9.36666667
x-bar j                  11.0333333
ni                       30
nj                       30
q (from Appendix 7)      3.4
MSwithin                 11.5149425
Upper Confidence Limit   0.43977374
Lower Confidence Limit   -3.77310707
Since this confidence interval contains zero, there is no significant difference between the average number of accidents that are associated with training methods #1 and #2.
Training method #1 compares favourably to training method #3, but otherwise the differences are not significant. This suggests that either training method #2 or #3 is the "worst". Again, we should be cautious, because there may be other explanatory factors. 13. Histograms of the sample data show significant skewness for some of the connection times. The data for early morning and late afternoon connection times appear skewed to the right, and the connection times for the evening are skewed to the left. Sample sizes are also relatively small. As a result, it would probably not be wise to proceed with ANOVA here, as the required conditions do not appear to be met.
[Histograms: Connection Times to Online Mutual Fund Account (frequency vs. time in seconds) for early morning, mid-day, early afternoon, late afternoon, and evening]
14. We are told the data were collected on a random sample of days. Histograms are shown below.
[Histograms: Commuting Times for 6 a.m., 7 a.m., and 8 a.m. Departures (frequency vs. number of minutes)]
The histograms appear approximately normal. The Excel ANOVA output is shown below.
Anova: Single Factor

SUMMARY
Groups                                       Count  Sum   Average   Variance
Commuting Time in Minutes, 6 a.m. Departure  24     1097  45.70833  172.3895
Commuting Time in Minutes, 7 a.m. Departure  22     1002  45.54545  175.4026
Commuting Time in Minutes, 8 a.m. Departure  27     1063  39.37037  197.5499

ANOVA
Source of Variation  SS        df  MS        F         P-value   F crit
Between Groups       667.0442  2   333.5221  1.826131  0.168624  3.127676
Within Groups        12784.71  70  182.6387
Total                13451.75  72
We see from the output that the variances are fairly close in value, and certainly the largest is less than four times as large as the smallest. It appears that the conditions for ANOVA are met.
H0: µ1 = µ2 = µ3
H1: At least one µ differs from the others.
α = 0.05
nT = 73, n1 = 24, n2 = 22, n3 = 27, k = 3
x̄1 = 45.7, x̄2 = 45.5, x̄3 = 39.4
s1² = 172.4, s2² = 175.4, s3² = 197.5
SSbetween = 667.0, SSwithin = 12784.7
We have already checked for normality and equality of variances.
F = 1.83
Excel provides a p-value of 0.17. Fail to reject H0. There is not enough evidence to conclude that the mean commuting times are not all equal.
15. First, check conditions. The data are not actually random samples, but could perhaps be considered to be (see the explanation in the exercise). Histograms of the data are shown below.
[Histograms: Final Grades for Classes Scheduled at 8 a.m. Thursday, 4 p.m. Friday, and 2 p.m. Wednesday (frequency vs. final grade)]
The histograms appear reasonably normal. The Excel ANOVA output is shown below.

Anova: Single Factor

SUMMARY
Groups                                          Count  Sum   Average   Variance
Marks of Class Scheduled for 8 a.m. Thursdays   20     1257  62.85     268.0289
Marks of Class Scheduled for 4 p.m. Fridays     23     1650  71.73913  305.2016
Marks of Class Scheduled for 2 p.m. Wednesdays  25     1691  67.64     263.99

ANOVA
Source of Variation  SS        df  MS        F         P-value
Between Groups       845.314   2   422.657   1.514253  0.22763
Within Groups        18142.74  65  279.1192
Total                18988.06  67
We can see from the output that the variances are sufficiently similar to allow us to assume the requirements for ANOVA are met (population variances approximately equal).
H0: µ1 = µ2 = µ3
H1: At least one µ differs from the others.
α = 0.01
nT = 68, n1 = 20, n2 = 23, n3 = 25, k = 3
x̄1 = 62.85, x̄2 = 71.74, x̄3 = 67.64
s1² = 268.03, s2² = 305.20, s3² = 263.99
SSbetween = 845.31, SSwithin = 18142.74
We have already checked for normality and equality of variances.
F = 1.514
Excel provides a p-value of 0.23. Fail to reject H0. There is not enough evidence to conclude that the mean grades for the students in classes for all three schedules are not equal. It does not appear that the scheduled time for classes affects the marks. However, we should be cautious, because there are many other factors that could be affecting marks. If we could control for them, we would be in a better position to investigate the effects of class schedule on student grades. 16. The first thing to note is that the data are not completely randomly selected. The information is provided by those who enter the contest. These customers may not represent all drugstore customers. Therefore, we must be cautious in interpreting the results. We would need more information about whether most customers entered the contest, before we could apply the results to all customers. As well, we have no way to be sure that the data are correct. Some people may have misrepresented their age or the value of their most recent purchase. With these caveats, we will proceed, but mostly for the practice! Histograms of the data appear approximately normal, and sample sizes, at 45, are fairly large.
[Histograms: Most Recent Drugstore Purchase (frequency vs. amount of purchase) for customers under 18, 18-25, 26-34, 35-49, 50-74, and 75 or more years old]
Excel's ANOVA output is shown below.

Anova: Single Factor

SUMMARY
Groups       Count  Sum      Average   Variance
Under 18     45     1055.7   23.46     106.4338
18-25        45     1246.36  27.69689  83.09607
26-34        45     1567.82  34.84044  57.77471
35-49        45     1604.26  35.65022  147.776
50-74        45     1647.04  36.60089  121.0066
75 and over  45     1172.11  26.04689  78.81046

ANOVA
Source of Variation  SS        df   MS        F         P-value   F crit
Between Groups       7179.96   5    1435.992  14.48308  1.53E-12  2.248208
Within Groups        26175.49  264  99.1496
Total                33355.46  269
The largest variance is 147.8, and the smallest is 57.8, so the largest variance is less than four times the smallest variance. We will assume that the population variances are sufficiently equal to proceed with ANOVA.
17.
H0: µ1 = µ2 = µ3 = µ4 = µ5 = µ6
H1: At least one µ differs from the others.
α = 0.05
nT = 270, n1 = 45, n2 = 45, n3 = 45, n4 = 45, n5 = 45, n6 = 45, k = 6
x̄1 = 23.46, x̄2 = 27.70, x̄3 = 34.84, x̄4 = 35.65, x̄5 = 36.60, x̄6 = 26.05
s1² = 106.43, s2² = 83.10, s3² = 57.77, s4² = 147.78, s5² = 121.01, s6² = 78.81
SSbetween = 7179.961, SSwithin = 26175.49
We have already checked for normality and equality of variances.
F = 14.5
Excel provides a p-value of approximately zero. Reject H0. There is enough evidence to conclude that the mean purchases of customers in different age groups are not all equal, when we consider the most recent purchases of those who entered the contest.

18. Because there are so many age groups in this data set, it is not as easy to see where the greatest differences in the sample means are simply by inspection. The easiest way to proceed is to create a table showing the differences in sample means. This is fairly easily constructed in Excel. An example of such a table is shown below. Notice that the table shows the absolute value of the differences.

             Under 18  18-25  26-34  35-49  50-74   75 and over
Under 18     0
18-25        4.237     0.000
26-34        11.380    7.144  0.000
35-49        12.190    7.953  0.810  0.000
50-74        13.141    8.904  1.760  0.951   0.000
75 and over  2.587     1.650  8.794  9.603   10.554  0

By inspection of the table, we can see that we should start first by comparing the differences of purchases for customers under 18 and 50-74, then under 18 and 35-49, then under 18 and 26-34, and so on. We need the q-value for 6, 264 degrees of freedom. We will use the value for 6, 120 degrees of freedom, as the closest entry in Appendix 7.
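A table of absolute differences like this can also be built programmatically. A small Python sketch, using the sample means from the Excel summary output (Python is not used in the text, which builds the table in Excel):

```python
# Pairwise absolute differences in sample means for the six age groups.
means = {"Under 18": 23.46, "18-25": 27.69689, "26-34": 34.84044,
         "35-49": 35.65022, "50-74": 36.60089, "75 and over": 26.04689}

diffs = {(a, b): round(abs(ma - mb), 3)
         for a, ma in means.items() for b, mb in means.items() if a != b}

# The largest difference identifies the first Tukey-Kramer comparison to make:
# Under 18 vs 50-74, with a difference of 13.141.
print(max(diffs.values()))
```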
The completed templates are shown below.

Under 18 and 50-74:

Tukey-Kramer Confidence Interval
Was the null hypothesis rejected in the ANOVA test?   yes
x-bar i                  23.46
x-bar j                  36.6008889
ni                       45
nj                       45
q (from Appendix 7)      4.1
MSwithin                 99.1496022
Upper Confidence Limit   -7.05501305
Lower Confidence Limit   -19.2267647

We have 95% confidence that the interval (-$19.23, -$7.06) contains the amount by which the average most recent purchase of customers under 18 differs from those aged 50-74 (for those who entered the contest).

Under 18 and 35-49:

Tukey-Kramer Confidence Interval
Was the null hypothesis rejected in the ANOVA test?   yes
x-bar i                  23.46
x-bar j                  35.6502222
ni                       45
nj                       45
q (from Appendix 7)      4.1
MSwithin                 99.1496022
Upper Confidence Limit   -6.10434638
Lower Confidence Limit   -18.2760981

We have 95% confidence that the interval (-$18.27, -$6.10) contains the amount by which the average most recent purchase of customers under 18 differs from those aged 35-49 (for those who entered the contest).
Under 18 and 26-34:

Tukey-Kramer Confidence Interval
Was the null hypothesis rejected in the ANOVA test?   yes
x-bar i                  23.46
x-bar j                  34.8404444
ni                       45
nj                       45
q (from Appendix 7)      4.1
MSwithin                 99.1496022
Upper Confidence Limit   -5.2945686
Lower Confidence Limit   -17.4663203

We have 95% confidence that the interval (-$17.47, -$5.29) contains the amount by which the average most recent purchase of customers under 18 differs from those aged 26-34 (for those who entered the contest).

75 and over and 50-74:

Tukey-Kramer Confidence Interval
Was the null hypothesis rejected in the ANOVA test?   yes
x-bar i                  26.0468889
x-bar j                  36.6008889
ni                       45
nj                       45
q (from Appendix 7)      4.1
MSwithin                 99.1496022
Upper Confidence Limit   -4.46812416
Lower Confidence Limit   -16.6398758

We have 95% confidence that the interval (-$16.64, -$4.47) contains the amount by which the average most recent purchase of customers 75 and over differs from those aged 50-74 (for those who entered the contest).
75 and over and 35-49:

Tukey-Kramer Confidence Interval
Was the null hypothesis rejected in the ANOVA test?   yes
x-bar i                  26.0468889
x-bar j                  35.6502222
ni                       45
nj                       45
q (from Appendix 7)      4.1
MSwithin                 99.1496022
Upper Confidence Limit   -3.51745749
Lower Confidence Limit   -15.6892092

We have 95% confidence that the interval (-$15.69, -$3.52) contains the amount by which the average most recent purchase of customers 75 and over differs from those aged 35-49 (for those who entered the contest).

50-74 and 26-34:

Tukey-Kramer Confidence Interval
Was the null hypothesis rejected in the ANOVA test?   yes
x-bar i                  36.6008889
x-bar j                  34.8404444
ni                       45
nj                       45
q (from Appendix 7)      4.1
MSwithin                 99.1496022
Upper Confidence Limit   7.84632028
Lower Confidence Limit   -4.3254314

At this point, we see the confidence interval contains zero. For this and all the remaining comparisons, there is not a significant difference in the average purchases (for those who entered the contest).
19. This question has already been answered, in the discussion of exercise 16. We proceeded, for practice, but these data do not represent a random sample of data about the drugstore customers.

20. Generally speaking, these data do not meet the requirements for ANOVA. The data sets are non-normal, and quite significantly skewed. The histograms for Canada-wide data are shown below.
[Histograms (frequency vs. wages and salaries) for three groups of Canadians, by highest level of schooling: Secondary School Graduation Certificate, Trades Certificate or Diploma, and College Certificate or Diploma.]
21. The professor has selected random samples from large classes, and there is no immediately obvious reason why the observations would not be independent. The sample data appear to be approximately normally distributed, as the histograms below illustrate.
[Histograms (frequency vs. final mark in Microeconomics) for students working, on average, <5 hours, 5 - <10 hours, 10 - <15 hours, 15 - <20 hours, and 20 or more hours per week.]
Excel's ANOVA output is shown below.

Anova: Single Factor

SUMMARY
Groups                       Count    Sum    Average   Variance
Less Than 5 Hours Per Week      32   2076      64.88     355.40
5 - <10 Hours Per Week          34   2217      65.21     251.44
10 - <15 Hours Per Week         36   1985      55.14     305.32
15 - <20 Hours Per Week         27   1557      57.67     284.00
20 or More Hours Per Week       24   1261      52.54     256.43

ANOVA
Source of Variation        SS      df        MS          F     P-value
Between Groups        3968.56       4  992.1399   3.392455    0.010938
Within Groups        43283.32     148  292.4549
Total                47251.88     152
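If Excel is not at hand, the F statistic and p-value in this output can be reproduced from the SS and df entries alone. A sketch in Python (assumes SciPy is available; the variable names are ours):

```python
from scipy.stats import f

ss_between, df_between = 3968.56, 4      # k - 1 = 5 - 1
ss_within, df_within = 43283.32, 148     # nT - k = 153 - 5

ms_between = ss_between / df_between     # about 992.14
ms_within = ss_within / df_within        # about 292.45
f_stat = ms_between / ms_within          # about 3.39
p_value = f.sf(f_stat, df_between, df_within)  # upper tail, about 0.0109
```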
The ANOVA output shows the largest variance as 355.40, and the smallest as 251.44, and so the largest variance is less than four times as large as the smallest. We will presume that the population variances are approximately equal.

H0: µ1 = µ2 = µ3 = µ4 = µ5
H1: At least one µ differs from the others.
α = 0.05
nT = 153, n1 = 32, n2 = 34, n3 = 36, n4 = 27, n5 = 24, k = 5
x̄1 = 64.88, x̄2 = 65.21, x̄3 = 55.14, x̄4 = 57.67, x̄5 = 52.54
s1² = 355.40, s2² = 251.44, s3² = 305.32, s4² = 284.00, s5² = 256.43
SSbetween = 3968.56, SSwithin = 43283.32

We have already checked for normality and equality of variances. F = 3.39. Excel provides a p-value of 0.010. Reject H0. There is enough evidence to suggest that the mean marks are not all equal.

Again, because there are so many possible comparisons, it is useful to calculate all differences in sample means, so we can see which is largest, second-largest, and so on. Such a summary table is shown below (absolute values of differences are shown).
                             Less Than   5 - <10     10 - <15    15 - <20    20 or More
                             5 Hours     Hours Per   Hours Per   Hours Per   Hours Per
                             Per Week    Week        Week        Week        Week
Less Than 5 Hours Per Week    0
5 - <10 Hours Per Week        0.330882    0
10 - <15 Hours Per Week       9.736111   10.06699     0
15 - <20 Hours Per Week       7.208333    7.539216    2.527778    0
20 or More Hours Per Week    12.33333    12.66422     2.597222    5.125       0
So, the first comparison will be the marks of students who work 20 or more hours a week and those who work 5 - <10 hours a week, then students who work 20 or more hours a week and those who work less than 5 hours a week, and so on. We need the q-value from Appendix 7 for 5, 148 degrees of freedom. Note that if we use the table value for 5, 120 degrees of freedom, we get the following result.

For the marks of those who work 20 or more hours a week, and those who work 5 - <10 hours a week:

Tukey-Kramer Confidence Interval
  Was the null hypothesis rejected in the ANOVA test?   yes
  x-bar i                  52.54
  x-bar j                  65.21
  ni                       24
  nj                       34
  q (from Appendix 7)      3.92
  MSwithin                 292.454883
  Upper Confidence Limit   -0.02647542
  Lower Confidence Limit   -25.3019559

We have 95% confidence that the interval (-25.30, -0.03) contains the amount by which the average mark of students who work 20 or more hours per week differs from that of students who work 5 - <10 hours per week. Note that although there appears to be a significant difference between the marks of those who work 20 or more hours a week, and those who work 5 - <10 hours a week, the size of the difference may be quite small.
For the marks of those who work 20 or more hours a week, and those who work <5 hours a week:

Tukey-Kramer Confidence Interval
  Was the null hypothesis rejected in the ANOVA test?   yes
  x-bar i                  52.54
  x-bar j                  64.88
  ni                       24
  nj                       32
  q (from Appendix 7)      3.92
  MSwithin                 292.454883
  Upper Confidence Limit   0.46678284
  Lower Confidence Limit   -25.1334495
This confidence interval contains zero. For this and all remaining comparisons, there is not a significant difference in the average marks.
Instructor’s Solutions Manual - Chapter 12
Chapter 12 Solutions

Develop Your Skills 12.1

1. We are looking for evidence of a decrease in the proportion of on-time flights after the merger. Call the population of flights before the merger population 1, and the population of flights after the merger population 2.
H0: p1 – p2 = 0
H1: p1 – p2 > 0
α = 0.04
p̂1 = 85/100 = 0.85, n1 = 100; p̂2 = 78/100 = 0.78, n2 = 100
Sampling is done without replacement, but it is likely that the airline handles many thousands of flights, so we can still use the binomial distribution as the appropriate underlying model.

Check for normality of the sampling distribution:
n1p̂1 = 100(0.85) = 85 > 10
n1q̂1 = 100(1 − 0.85) = 100(0.15) = 15 > 10
n2p̂2 = 100(0.78) = 78 > 10
n2q̂2 = 100(1 − 0.78) = 100(0.22) = 22 > 10

Since the null hypothesis is that there is no difference in the proportions, we can pool the sample data to estimate p̂:

p̂ = (85 + 78)/(100 + 100) = 0.815

We calculate the z-score as:

z = [(p̂1 − p̂2) − 0] / sqrt[p̂q̂(1/n1 + 1/n2)]
  = (0.85 − 0.78) / sqrt[(0.815)(1 − 0.815)(1/100 + 1/100)]
  = 0.07 / 0.054913568
  = 1.2747
p-value = P(z ≥ 1.27) = 1 – 0.8980 = 0.102 Since p-value > α, fail to reject H0. There is insufficient evidence to infer that the proportion of on-time flights decreased after the merger.
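The same pooled two-proportion z-test can be reproduced with a short Python sketch (standard library only; the variable names are ours):

```python
from math import sqrt
from statistics import NormalDist

x1, n1 = 85, 100   # on-time flights before the merger
x2, n2 = 78, 100   # on-time flights after the merger

p1, p2 = x1 / n1, x2 / n2
p_pool = (x1 + x2) / (n1 + n2)   # pooling is valid because H0 says p1 = p2
se = sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
z = (p1 - p2 - 0) / se                 # about 1.27
p_value = 1 - NormalDist().cdf(z)      # upper-tail test, about 0.101
```

Working from the unrounded z-score gives 0.101 rather than the 0.102 read from the tables with z = 1.27; the conclusion is unchanged.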
2.
Call the data on use of social network profiles by online Canadians in 2009 sample 1 from population 1, and the data on use of social network profiles by online Canadians 18 months previously sample 2 from population 2.
H0: p1 – p2 = 0.10
H1: p1 – p2 > 0.10
α = 0.05

p̂1 = 462/824 = 0.5607, n1 = 824; p̂2 = 0.39, n2 = 800
Sampling is done without replacement, but there are millions of online Canadians, so we can still use the binomial distribution as the appropriate underlying model.

Check for normality of the sampling distribution:
n1p̂1 = 462 > 10
n1q̂1 = 824 − 462 = 362 > 10
n2p̂2 = 800(0.39) = 312 > 10
n2q̂2 = 800(1 − 0.39) = 800(0.61) = 488 > 10

Since the null hypothesis is that there is a 10% difference in the proportions, we cannot pool the sample data to estimate p̂.

We calculate the z-score as:

z = [(p̂1 − p̂2) − µ(p̂1−p̂2)] / sqrt[p̂1q̂1/n1 + p̂2q̂2/n2]
  = [(0.5607 − 0.39) − 0.10] / sqrt[(0.5607)(0.4393)/824 + (0.39)(0.61)/800]
  = 2.89
p-value = P(z ≥ 2.89) = 1 – 0.9981 = 0.0019 Since p-value < α, reject H0. There is sufficient evidence to infer that the proportion of online Canadians with a social network profile is more than 10% higher in 2009 than it was 18 months previous.
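Because H0 specifies a non-zero difference here, the proportions are not pooled. A Python sketch of the same calculation (standard library only):

```python
from math import sqrt
from statistics import NormalDist

p1, n1 = 462 / 824, 824   # 2009 sample
p2, n2 = 0.39, 800        # sample from 18 months earlier
d0 = 0.10                 # difference specified by H0

# Unpooled standard error, since H0 does not say p1 = p2
se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
z = (p1 - p2 - d0) / se               # about 2.89
p_value = 1 - NormalDist().cdf(z)     # upper-tail test, about 0.0019
```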
3.
Call the data on perceptions of female bank employees sample 1 from population 1, and the data on perceptions of male bank employees sample 2 from population 2. We want to know if the proportion of male employees who felt that female employees had as much opportunity for advancement as male employees is more than 10% higher than the proportion of female employees who thought so. So, we are wondering if p2 is more than 10% higher than p1, that is, if p2 – p1 > 0.10. Rewriting this in standard format, we ask the equivalent question: is p1 – p2 < -0.10?
H0: p1 – p2 = -0.10
H1: p1 – p2 < -0.10
α = 0.05
p̂1 = 0.825, n1 = 240; p̂2 = 0.943, n2 = 350

Sampling is done without replacement, but Canadian banks are large employers (in 2006, the Royal Bank employed about 69,000 people, for instance), so we can still use the binomial distribution as the appropriate underlying model.

Check for normality of the sampling distribution:
n1p̂1 = 240(0.825) = 198 > 10
n1q̂1 = 240(1 − 0.825) = 42 > 10
n2p̂2 = 350(0.943) = 330.05 > 10
n2q̂2 = 350(1 − 0.943) = 19.95 > 10

The null hypothesis is that there is a 10% difference in the proportions, so we cannot pool the sample data to estimate p̂.

We calculate the z-score as:

z = [(p̂1 − p̂2) − µ(p̂1−p̂2)] / sqrt[p̂1q̂1/n1 + p̂2q̂2/n2]
  = [(0.825 − 0.943) − (−0.10)] / sqrt[(0.825)(0.175)/240 + (0.943)(0.057)/350]
  = −0.655
p-value = P(z ≤ -0.66) = 0.2546 Since p-value > α, fail to reject H0. There is not enough evidence to infer that the proportion of male employees who felt that female employees had as much opportunity for advancement as male employees was more than 10% higher than the proportion of female employees who thought so.
4.
Call the data on customers told about the extended warranty by cashiers sample 1 from population 1. Call the data on customers exposed to the display at the checkout sample 2 from population 2.
H0: p1 – p2 = 0
H1: p1 – p2 ≠ 0
α = 0.10

We presume the store has many thousands of customers, so although we are sampling without replacement, we can still use the binomial distribution as the appropriate underlying model. The data are available in Excel, so we will use Excel to do this problem. First the raw data must be organized. The Excel output from the Histogram tool is shown below.

Cashier              Display
Bin   Frequency      Bin   Frequency
0     122            0     145
1     28             1     55

We can then use the Excel template to proceed. The output is shown below.

Making Decisions About Two Population Proportions
  Sample 1 Size           150
  Sample 2 Size           200
  Sample 1 Proportion     0.18666667
  Sample 2 Proportion     0.275
  n1 • p1hat              28
  n1 • q1hat              122
  n2 • p2hat              55
  n2 • q2hat              145
  Are np and nq >= 10?    yes
  Hypothesized Difference in Population Proportions, p1-p2 (decimal form)   0
  z-Score                 -1.92275784
  One-Tailed p-Value      0.02725523
  Two-Tailed p-Value      0.05451047
This is a two-tailed test, so the appropriate p-value is 0.0545. Since this is less than α, reject H0. There is sufficient evidence to infer there is a difference in the proportion of customers who buy the extended warranty when exposed to promotion by a display or informed by the cashier.

5.
We will use Excel to calculate this confidence interval (it could also be done by hand, based on the information acquired in Exercise 4 above). The Excel template is shown below.
Confidence Interval Estimate for the Difference in Population Proportions
  Confidence Level (decimal form)   0.9
  Sample 1 Proportion               0.18667
  Sample 2 Proportion               0.275
  Sample 1 Size                     150
  Sample 2 Size                     200
  n1 • p1hat                        28
  n1 • q1hat                        122
  n2 • p2hat                        55
  n2 • q2hat                        145
  Are np and nq >= 10?              yes
  Upper Confidence Limit            -0.0146
  Lower Confidence Limit            -0.1621
With 90% confidence, we estimate that the interval (-0.162, -0.015) contains the true difference in the proportion of customers who buy the extended warranty, when told about it by the cashier, compared with being exposed to a prominent display at the checkout. Another way to say this: a greater proportion of those exposed to the display bought the extended warranty. We estimate the difference to be contained in the interval (1.5%, 16.2%). This confidence interval corresponds to the hypothesis test in the preceding exercise. Since we rejected the hypothesis of no difference, we would not expect the confidence interval to contain zero (and it does not).
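The template's limits can be verified by hand or with a minimal Python sketch of the unpooled interval (p̂1 − p̂2) ± z*·sqrt(p̂1q̂1/n1 + p̂2q̂2/n2) (standard library only; variable names are ours):

```python
from math import sqrt
from statistics import NormalDist

p1, n1 = 28 / 150, 150   # bought the warranty after the cashier mentioned it
p2, n2 = 55 / 200, 200   # bought the warranty after seeing the display

confidence = 0.90
z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)   # about 1.645

se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)  # unpooled, as for any estimate
margin = z_star * se
lo, hi = (p1 - p2) - margin, (p1 - p2) + margin     # about (-0.162, -0.015)
```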
Develop Your Skills 12.2

6. First, summarize the data, and calculate expected values. See the table below.

                    Past %   Observed   Expected
Pay Now             0.26     23         19.5
Pay In Six Months   0.37     32         27.75
Pay In One Year     0.37     20         27.75
Total                        75         75
Expected values are calculated as follows. In the past, 26% of customers paid immediately. Out of the 75 customers surveyed, we would expect 26% • 75 = 19.5 customers to pay immediately. The other expected values are calculated in a similar fashion.

H0: The distribution of customers according to method of payment is the same now as it was in the past.
H1: The distribution of customers according to method of payment is different now, compared to the past.
α = 0.05 (given)

All expected values are more than 5, so we can proceed.

X² = Σ (oi − ei)²/ei = (23 − 19.5)²/19.5 + (32 − 27.75)²/27.75 + (20 − 27.75)²/27.75 = 3.444

Degrees of freedom = k – 1 = 2. Using the tables, we see that p-value > 0.100. Using CHITEST, we see that p-value = 0.1787. Fail to reject H0. There is insufficient evidence to infer that there has been a change in customers' preferences for the different payment plans.

[Bar chart: Customer Preferences for Payment Plans, comparing past and current customer preferences for Pay Now, Pay in Six Months, and Pay in One Year.]
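CHITEST's goodness-of-fit result for this exercise can be reproduced outside Excel; a sketch with SciPy (assumes SciPy is installed):

```python
from scipy.stats import chisquare

observed = [23, 32, 20]          # Pay Now, Pay in Six Months, Pay in One Year
expected = [19.5, 27.75, 27.75]  # 26%, 37%, 37% of the 75 customers surveyed

result = chisquare(f_obs=observed, f_exp=expected)
# result.statistic is about 3.444 and result.pvalue about 0.179,
# matching the manual calculation and CHITEST
```

Note that `chisquare` requires the observed and expected counts to sum to (approximately) the same total, which they do here (both sum to 75).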
7.
H0: The distribution of customers' brand preferences is as claimed by the previous manager.
H1: The distribution of customers' brand preferences is different from what was claimed by the previous manager.
α = 0.05 (given)

Expand the table of claimed brand preferences to show expected and observed values, as shown below.

                              Labatt   Labatt Blue   Molson                Rickard's
                              Blue     Light         Canadian   Kokanee    Honey Brown
Claimed Preference            33%      8%            25%        19%        15%
Expected (for 85 Customers)   28.05    6.8           21.25      16.15      12.75
Observed                      29       6             21         16         13

All expected values are more than 5, so we can proceed.

X² = Σ (oi − ei)²/ei
   = (29 − 28.05)²/28.05 + (6 − 6.8)²/6.8 + (21 − 21.25)²/21.25 + (16 − 16.15)²/16.15 + (13 − 12.75)²/12.75
   = 0.1355

Degrees of freedom = k – 1 = 4. Using the tables, we see that p-value > 0.100. Using CHITEST, we see that p-value = 0.9978. Fail to reject H0. There is insufficient evidence to infer that the distribution of customers' brand preferences is different from what the previous manager claimed.
8.
H0: The die is fair (the probability of occurrence of each side is 1/6).
H1: The die is not fair.
α = 0.025 (given)

Expand the table of observations from repeated tosses of a die to show expected and observed values, as shown below.

                        1 Spot     2 Spots    3 Spots    4 Spots    5 Spots    6 Spots
Observed                18         24         17         25         16         25
Expected (Out of 125)   20.83333   20.83333   20.83333   20.83333   20.83333   20.83333

All expected values are more than 5. Using the formula as before, we calculate X² = 4.36. Degrees of freedom = k – 1 = 5. From the table, we see p-value > 0.100. Using CHITEST, we see that p-value = 0.4988. Fail to reject H0. There is insufficient evidence to infer that the die is not fair. John's troubles are of his own making. We have no way to know if Mary will ever forgive him.

9.
H0: The distribution of customer destination preferences at the travel agency is the same as in the past.
H1: The distribution of customer destination preferences at the travel agency is different from the past.
α = 0.04 (given)

Summarize the data from the random sample and calculate expected values.

                                Canada   U.S.    Caribbean   Europe   Asia   Australia/New Zealand   Other
Past Preferences                28%      32%     22%         12%      2%     3%                      1%
Expected (for a sample of 54)   15.12    17.28   11.88       6.48     1.08   1.62                    0.54
Observed                        22       14      8           8        0      2                       0
In this case, there are three expected values < 5 (Asia, Australia/New Zealand, and Other). It seems logical to combine these categories, and then proceed. The new table of expected and observed values is shown below.

                                Canada   U.S.    Caribbean   Europe   Other
Past Preferences                28%      32%     22%         12%      6%
Expected (for a sample of 54)   15.12    17.28   11.88       6.48     3.24
Observed                        22       14      8           8        2
However, even this change still leaves us with an expected value < 5. We must combine categories again. It is less satisfying to combine Europe with Asia, Australia/New Zealand and Other. However, all of these destinations represent destinations at a significant distance from the North American continent, so there is some sense to combining them. The final table of expected and observed values is shown below.

                                Canada   U.S.    Caribbean   Europe, Asia, Australia/New Zealand, Other
Past Preferences                28%      32%     22%         18%
Expected (for a sample of 54)   15.12    17.28   11.88       9.72
Observed                        22       14      8           10
Now that all expected values are ≥ 5, we can proceed. Using the formula as before, we calculate X² = 5.028. Using the tables, with 3 degrees of freedom, we see p-value > 0.100. Using CHITEST, we see that p-value = 0.1697. Fail to reject H0. There is insufficient evidence to infer that there has been a change in customer destination preferences at this travel agency.

10. H0: The distribution of responses to the survey at the local branch is the same as the national benchmarks.
H1: The distribution of responses to the survey at the local branch is not the same as the national benchmarks.
α = 0.025; X² = 20.859; from the tables (4 degrees of freedom), p-value < 0.005.
Reject H0. There is sufficient evidence to infer that the distribution of responses to the survey at the local branch differs from the national benchmarks. The graph below gives some indication of where the differences lie.

[Bar chart: Response to a Survey of Financial Services Company Customers, "The staff at my local branch can provide me with good advice on my financial affairs." National benchmark vs. observed percentage for Strongly Agree, Agree, Neither Agree nor Disagree, Disagree, and Strongly Disagree.]
Develop Your Skills 12.3

11. H0: There is no relationship between the views on the proposed health benefit changes and the type of job held in the organization.
H1: There is a relationship between the views on the proposed health benefit changes and the type of job held in the organization.
α = 0.01

The calculations of expected values for a contingency table can be done manually, but are somewhat tedious. We will use Excel's Non-Parametric Tool for Chi-Squared Expected Value Calculations. The Excel output is shown below.

Chi-Squared Expected Values Calculations
Chi-squared test statistic    16.44338
# of expected values < 5      0
p-value                       0.002478

                         in favour   opposed    undecided
Management               19.32377    15.62705   6.04918
Professional, Salaried   53.72951    43.45082   16.81967
Clerical, Hourly Paid    41.94672    33.92213   13.13115
(The Excel output will allow you to check your manual calculations.) We see that there are no expected values < 5, so we can proceed.
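The expected counts follow the usual rule eij = (row total)(column total)/(grand total). The sketch below rebuilds the table from the marginal totals implied by the Excel output (row totals 41, 114, 89 and column totals 115, 93, 36 are our inference from the numbers shown, not given in the exercise):

```python
# Marginal totals inferred from the expected-value table above (an assumption)
row_totals = [41, 114, 89]   # Management; Professional, Salaried; Clerical, Hourly Paid
col_totals = [115, 93, 36]   # in favour; opposed; undecided
n = sum(row_totals)          # 244 responses in total

# Expected count for each cell: (row total)(column total) / (grand total)
expected = [[r * c / n for c in col_totals] for r in row_totals]
# expected[0][0] is about 19.324, matching the Excel output
```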
The p-value is 0.002478, which is < α = 0.01. Reject H0. There is sufficient evidence to infer that there is a relationship between the views on the proposed health benefits changes and the type of job held in the organization.

12. H0: There is no relationship between hair colour and tendency to use sunscreen.
H1: There is a relationship between hair colour and tendency to use sunscreen.
α = 0.05

The output of the Excel tool for Chi-Squared Expected Value Calculations is shown below.

Chi-squared test statistic    21.39291
# of expected values < 5      0
p-value                       0.011016

Hair Colour   Always     Usually    Once in a While   Never
Red           22.94828   23.51724   8.913793          10.62069
Blonde        30.5977    31.35632   11.88506          14.16092
Brown         29.2069    29.93103   11.34483          13.51724
Black         38.24713   39.1954    14.85632          17.70115

All of the expected values are ≥ 5, so we can proceed. The p-value is 0.011 < 0.05. Reject H0. There is evidence of a relationship between hair colour and tendency to use sunscreen.
13. H0: There is no relationship between household income and the section of the paper read most closely.
H1: There is a relationship between household income and the section of the paper read most closely.
α = 0.25

The output of the Excel tool for Chi-Squared Expected Value Calculations is shown below.

Chi-Squared Expected Values Calculations
Chi-squared test statistic    51.92698
# of expected values < 5      0
p-value                       1.74E-08

Household Income     National and World News   Business   Sports     Arts       Lifestyle
Under $40,000        48.96498                  40.74708   46.56809   19.1751    20.54475
$40,000 to $70,000   52.58171                  43.75681   50.00778   20.59144   22.06226
Over $70,000         41.45331                  34.49611   39.42412   16.23346   17.393

All expected values are ≥ 5, so we can proceed. The p-value is 0.0000000174, which is extremely small. Reject H0. There is evidence of a relationship between household income and the section of the paper read most closely.

14. H0: There is no difference in the proportions of students whose first language is English, French or something else among these four schools.
H1: There is a difference in the proportions of students whose first language is English, French or something else among these four schools.
α = 0.05

The output of the Excel tool for Chi-Squared Expected Value Calculations is shown below.
All expected values are ≥ 5, so we can proceed. The p-value is 0.0000000174, which is extremely small. Reject H0. There is evidence of a relationship between household income and the section of the paper read most closely. 14. H0: There is no difference in the proportions of students whose first language is English, French or something else among these four schools. H1: There is a difference in the proportions of students whose first language is English, French or something else among these four schools. α = 0.05 The output of the Excel tool for Chi-squared Expected Value Calculations is shown below. Chi-Squared Expected Values Calculations Chi-squared test statistic 9.518055 # of expected values <5 0 p-value 0.14647
First Language English French Other
Copyright © 2011 Pearson Canada Inc.
School #1 44.25 44.25 11.5
School #2 44.25 44.25 11.5
School #3 44.25 44.25 11.5
School #4 44.25 44.25 11.5
332
Instructor’s Solutions Manual - Chapter 12
All expected values are ≥ 5, so we can proceed. The p-value is 0.14647 > α = 0.05. Fail to reject H0. There is insufficient evidence to infer that there are differences in the proportions of students whose first language is English, French, or something else for these four schools.

15. H0: The proportions of students drawn from inside or outside the local area are the same for the Business, Technology and Nursing programs at a college.
H1: The proportions of students drawn from inside or outside the local area are different for the Business, Technology and Nursing programs at a college.
α = 0.025

The output of the Excel tool for Chi-Squared Expected Value Calculations is shown below.

Chi-Squared Expected Values Calculations
Chi-squared test statistic    0.823106
# of expected values < 5      0
p-value                       0.662621

                      Business   Technology   Nursing
From local area       68.46154   45.64103     63.89744
Not from local area   81.53846   54.35897     76.10256
All expected values are ≥ 5, so we can proceed. The p-value is 0.66 > α = 0.025. Fail to reject H0. There is not enough evidence to infer that the proportions of students drawn from inside or outside the local area are different for the Business, Technology and Nursing programs at a college.

Chapter Review Questions

1. We can pool data when the null hypothesis is that there is NO difference in the population proportions. If that is the case (as we assume), then we can pool the sample data, because both samples provide estimates of the same proportion of successes. We cannot pool the data when the null hypothesis is that the population proportions differ by 5%, because the sample data are providing estimates of two different proportions of success.

2. We can't pool the sample data when we are constructing a confidence interval estimate of p1 – p2, because we are not assuming there is no difference in the population proportions. We do not have a null hypothesis in mind when we are estimating the difference in proportions.

3. H0: p1 – p2 = -0.10
H1: p1 – p2 < -0.10
This may not be immediately obvious. Remember, the subscript 1 corresponds to last year's results, and the subscript 2 corresponds to this year's results. If the proportion
of people who pass this year is more than 10% higher, then when we subtract p1-p2, we will get a negative number, and it will be to the left of -0.10 on the number line. 4.
The Chi-square goodness-of-fit test measures only how closely the observed frequencies match the expected frequencies. The test does not take into account whether the differences are positive or negative (differences are squared in the calculation of the test statistic). Larger differences result in larger values of the Chi-square test statistic. Unusually large values (in the right tail of the distribution) signal that there are significant differences in the distributions.
5.
Repeated tests on the same data set lead to higher chances of Type I error, and are therefore not reliable. A Chi-square test allows us to compare all three proportions simultaneously.
6.
Call the data on members who were taking fitness classes sample 1 from population 1. Call the data on members who were working with a personal trainer sample 2 from population 2.
H0: p1 – p2 = 0
H1: p1 – p2 ≠ 0
α = 0.05
p̂1 = 38/60 = 0.63333333, n1 = 60; p̂2 = 60/80 = 0.75, n2 = 80
Sampling is done without replacement, but presumably the fitness club has hundreds of members. This is an assumption that we should note before we proceed to use the binomial distribution as the underlying model.

Check for normality of the sampling distribution:
n1p̂1 = 38 > 10
n1q̂1 = 60 − 38 = 22 > 10
n2p̂2 = 60 > 10
n2q̂2 = 80 − 60 = 20 > 10

Since the null hypothesis is that there is no difference in the proportions, we can pool the sample data to estimate p̂:

p̂ = (38 + 60)/(60 + 80) = 0.70
We calculate the z-score as:

z = [(p̂1 − p̂2) − 0] / sqrt[p̂q̂(1/n1 + 1/n2)]
  = (38/60 − 0.75) / sqrt[(0.70)(1 − 0.70)(1/60 + 1/80)]
  = −1.49

(Note that p̂1 is left in fractional form to preserve accuracy for calculations with a calculator.)

p-value = 2 • P(z ≤ −1.49) = 2 • 0.0681 = 0.1362

Since p-value > α, fail to reject H0. There is insufficient evidence to infer that there is a difference in the proportion of new members still working out regularly six months after joining the club, when comparing those who attend fitness classes with those who work out with a personal trainer.

7.
Since the Chi-square test is equivalent to the test of proportions, we expect to get the same answer. First, set up the appropriate contingency table for the data, as shown below.

                                  Still working out     Quit working out
                                  in first six months   in first six months
Taking fitness classes            38                    22
Working with a personal trainer   60                    20

The setup of the problem is the same, with the same null and alternative hypotheses. The output of the Excel tool for Chi-Squared Expected Value Calculations is shown below.

Chi-Squared Expected Values Calculations
Chi-squared test statistic    2.222222
# of expected values < 5      0
p-value                       0.136037

Expected values:
                                  Still working out     Quit working out
                                  in first six months   in first six months
Taking fitness classes            42                    18
Working with a personal trainer   56                    24
The p-value is the same as in the previous problem (it differs slightly only because we used the tables for the calculation in the previous question, which involves rounding the z-score to two decimal places). Of course, the conclusion is also the same.
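This equivalence can also be confirmed with SciPy's contingency-table test; `correction=False` turns off the Yates continuity correction so the statistic matches the uncorrected X² used here (a sketch, assuming SciPy is installed):

```python
from scipy.stats import chi2_contingency

observed = [[38, 22],   # fitness classes: still working out, quit
            [60, 20]]   # personal trainer: still working out, quit

chi2_stat, p_value, dof, expected = chi2_contingency(observed, correction=False)
# chi2_stat is about 2.222 and p_value about 0.136, the same (up to rounding)
# as the two-tailed z-test in the previous question; dof is 1
```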
8.
Call the data on deliveries by the private courier sample 1 from population 1. Call the data on deliveries by Canada Post sample 2 from population 2.
H0: p1 – p2 = 0.05
H1: p1 – p2 > 0.05
α = 0.025

p̂1 = 0.89, n1 = 100; p̂2 = 0.80, n2 = 75

Sampling is done without replacement, but presumably both the private courier and Canada Post make a very large number of deliveries, so we can use the binomial distribution as the appropriate underlying model.

Check for normality of the sampling distribution:
n1p̂1 = 100(0.89) = 89 > 10
n1q̂1 = 100(1 − 0.89) = 11 ≥ 10
n2p̂2 = 75(0.80) = 60 > 10
n2q̂2 = 75(1 − 0.80) = 15 > 10

Since the null hypothesis is that there is a difference in the proportions, we cannot pool the sample data. We calculate the z-score as:

z = [(p̂1 − p̂2) − µ(p̂1−p̂2)] / sqrt[p̂1q̂1/n1 + p̂2q̂2/n2]
  = [(0.89 − 0.80) − 0.05] / sqrt[(0.89)(0.11)/100 + (0.80)(0.20)/75]
  = 0.72

p-value = P(z ≥ 0.72) = 1 – 0.7642 = 0.2358

Since p-value > α, fail to reject H0. There is insufficient evidence to infer that the on-time or early percentage for the private courier is more than 5% higher than Canada Post's. Using this criterion for the decision, the mail order company should not use the private courier service.
9.
Call the data on students who were called by program faculty sample 1 from population 1. Call the data on students who were only sent a package in the mail sample 2 from population 2.
H0: p1 – p2 = 0
H1: p1 – p2 > 0
α = 0.025

p̂1 = 234/278 = 0.841726618, n1 = 278; p̂2 = 232/302 = 0.76821192, n2 = 302

Sampling is done without replacement. We have no information on college enrolment. Sample sizes are fairly large. For these samples to be at most 5% of the relevant populations, the college would have to have about 5560 students who were called by faculty in total, and about 6040 who were sent acceptance packages. This means a fairly large potential first-year enrolment. This is an assumption that we should note before we proceed to use the binomial distribution as the underlying model.

Check for normality of the sampling distribution:
n1p̂1 = 234 > 10
n1q̂1 = 278 − 234 = 44 > 10
n2p̂2 = 232 > 10
n2q̂2 = 302 − 232 = 70 > 10

Since the null hypothesis is that there is no difference in the proportions, we can pool the sample data to estimate p̂:

p̂ = (234 + 232)/(278 + 302) = 466/580 = 0.80344828

We calculate the z-score as:

z = [(p̂1 − p̂2) − 0] / sqrt[p̂q̂(1/n1 + 1/n2)]
  = (234/278 − 232/302) / sqrt[(466/580)(1 − 466/580)(1/278 + 1/302)]
  = 2.23
(Note that some proportions are left in fractional form to preserve accuracy for calculations with a calculator.)

p-value = P(z ≥ 2.23) = 1 − 0.9871 = 0.0129
Since p-value < α, reject H0. There is sufficient evidence to infer that the proportion of prospective students who send acceptances is higher when they get calls from program faculty (compared with receiving a package in the mail).

10. This confidence interval could be calculated manually. The output of the Excel template is shown below (manual calculations should be very close).

Confidence Interval Estimate for the Difference in Population Proportions
Confidence Level (decimal form)   0.95
Sample 1 Proportion               0.84173
Sample 2 Proportion               0.76821
Sample 1 Size                     278
Sample 2 Size                     302
n1 • p1hat                        234
n1 • q1hat                        44
n2 • p2hat                        232
n2 • q2hat                        70
Are np and nq >=10?               yes
Upper Confidence Limit            0.13759
Lower Confidence Limit            0.00944
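The template's limits can be reproduced with the unpooled standard error. A sketch (the function name is my own; 1.96 is the 95% critical value):

```python
from math import sqrt

def two_prop_ci(x1, n1, x2, n2, z_crit=1.96):
    """Confidence interval for p1 - p2, using the unpooled standard error."""
    p1, p2 = x1 / n1, x2 / n2
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    diff = p1 - p2
    return diff - z_crit * se, diff + z_crit * se

lo, hi = two_prop_ci(234, 278, 232, 302)   # ~ (0.00944, 0.13759)
```

Note the standard error here is unpooled, unlike the hypothesis test: with no null hypothesis of equality, each sample proportion estimates its own population proportion.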
With 95% confidence, we estimate that the proportion of students who send acceptances when called by program faculty is 0.9% to 13.8% higher than the proportion who send acceptances when they receive only a package in the mail.

11. Call the data on managers who have been sent to conflict resolution training sample 1 from population 1. Call the data on non-managerial employees who have been sent to conflict resolution training sample 2 from population 2.
H0: p1 – p2 = 0
H1: p1 – p2 ≠ 0
α = 0.025
p̂1 = 36/50 = 0.72, n1 = 50
p̂2 = 38/76 = 0.50, n2 = 76
Sampling is done without replacement. We have no information on the total number of employees who have been sent to conflict resolution training. We are told that the company is “large”. For these samples to be at most 5% of the relevant populations, the company would have had to send 1,000 managerial employees and 1,520 non-managerial employees to the training. This is an assumption that we should note before we proceed to use the binomial distribution as the underlying model.
Check for normality of the sampling distribution:
n1p̂1 = 36 > 10
n1q̂1 = 50 − 36 = 14 > 10
n2p̂2 = 38 > 10
n2q̂2 = 76 − 38 = 38 > 10
Since the null hypothesis is that there is no difference in the proportions, we can pool the sample data to estimate p̂.
p̂ = (36 + 38)/(50 + 76) = 74/126 = 0.587301587
We calculate the z-score as:
z = [(p̂1 − p̂2) − 0] / √[p̂q̂(1/n1 + 1/n2)]
  = (0.72 − 0.50 − 0) / √[(74/126)(1 − 74/126)(1/50 + 1/76)]
  = 2.45
(Note that some proportions are left in fractional form to preserve accuracy for calculations with a calculator.)
p-value = 2 • P(z ≥ 2.45) = 2 • (1 − 0.9929) = 2 • 0.0071 = 0.0142
Since p-value < α, reject H0. There is sufficient evidence to infer there is a difference in the proportions of managers and non-managers who thought that conflict resolution training was a waste of time.

12. In the sample, 72% of managers and 50% of non-managers thought the training was a waste of time. There is no way to know why, and this is something that might be worthy of further research. Was the training perceived as a waste of time because the employees felt they did not benefit? If they did not benefit, was this because they learned nothing new, or because they thought the training was poorly done? Was it a “waste” of time only because they felt they had more important tasks to complete?
The initial research was not really that helpful. It would have been more appropriate to ask the employees what they learned at the training, and whether they were likely to put what they learned into practice. In general, however, the sample results raise a question about whether the training is accomplishing its intended goals. Before continuing to spend money on training, the decision to implement the training should be revisited, with further data collection a possibility.

13. We can set this up as a Chi-square test, with the information organized as in the table below.
                          Manufacturer #1  Manufacturer #2  Manufacturer #3
Defective Components      36               30               38
Non-Defective Components  89               95               87
Total                     125              125              125
H0: The proportions of defective items are the same for all three manufacturers.
H1: The proportions of defective items are different among the three manufacturers.
α = 0.05
The output from the Excel tool for Chi-Squared Expected Value Calculations is shown below.

Chi-Squared Expected Values Calculations
Chi-squared test statistic   1.383764
# of expected values <5      0
p-value                      0.500633

Expected values:
                          #1        #2        #3
Defective Components      34.66667  34.66667  34.66667
Non-Defective Components  90.33333  90.33333  90.33333

All the expected values are ≥ 5, so we can proceed. The p-value is 0.5 > α = 0.05. Fail to reject H0. There is not enough evidence to infer that the proportions of defective items are different among the three manufacturers.
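The Excel tool's numbers can be reproduced from first principles. A sketch (function names are my own); the closed-form tail probability works because the degrees of freedom, (2 − 1)(3 − 1) = 2, are even:

```python
from math import exp

def chi2_test(observed):
    """Chi-squared statistic and expected counts for a contingency table."""
    row_tot = [sum(r) for r in observed]
    col_tot = [sum(c) for c in zip(*observed)]
    n = sum(row_tot)
    expected = [[r * c / n for c in col_tot] for r in row_tot]
    stat = sum((o - e) ** 2 / e
               for obs_r, exp_r in zip(observed, expected)
               for o, e in zip(obs_r, exp_r))
    return stat, expected

def chi2_sf_even(x, df):
    """P(chi2_df > x) for even df, via the closed-form Poisson sum."""
    term, total = 1.0, 1.0
    for i in range(1, df // 2):
        term *= (x / 2) / i
        total += term
    return exp(-x / 2) * total

stat, expected = chi2_test([[36, 30, 38], [89, 95, 87]])
p_value = chi2_sf_even(stat, df=2)   # ~ 0.500633, matching the Excel output
```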
14. Refer to the two plants as Plant 1 and Plant 2.
H0: p1 – p2 = 0
H1: p1 – p2 ≠ 0
α = 0.05
p̂1 = 23/150 = 0.15333333, n1 = 150
p̂2 = 23/125 = 0.184, n2 = 125
Sampling is done without replacement. We have no information on the total number of employees at the two plants. As long as Plant 1 has at least 3,000 employees and Plant 2 has at least 2,500, we can still use the binomial distribution as the appropriate underlying model. We note this assumption and proceed.
Check for normality of the sampling distribution:
n1p̂1 = 23 > 10
n1q̂1 = 150 − 23 = 127 > 10
n2p̂2 = 23 > 10
n2q̂2 = 125 − 23 = 102 > 10
Since the null hypothesis is that there is no difference in the proportions, we can pool the sample data to estimate p̂.
p̂ = (23 + 23)/(150 + 125) = 46/275 = 0.16727272727
We calculate the z-score as:
z = [(p̂1 − p̂2) − 0] / √[p̂q̂(1/n1 + 1/n2)]
  = (23/150 − 0.184 − 0) / √[(46/275)(1 − 46/275)(1/150 + 1/125)]
  = −0.68
(Note that some proportions are left in fractional form to preserve accuracy for calculations with a calculator.)
p-value = 2 • P(z ≤ −0.68) = 2 • 0.2483 = 0.4966
Using Excel, we calculate the exact p-value as 0.497. (See the output from the Excel template for Making Decisions About Two Population Proportions, Qualitative Data, shown below.)
Making Decisions About Two Population Proportions
Sample 1 Size                150
Sample 2 Size                125
Sample 1 Proportion          0.15333333
Sample 2 Proportion          0.184
n1 • p1hat                   23
n1 • q1hat                   127
n2 • p2hat                   23
n2 • q2hat                   102
Are np and nq >=10?          yes
Hypothesized Difference in Population Proportions, p1-p2 (decimal form)  0
z-Score                      -0.67847976
One-Tailed p-Value           0.24873378
Two-Tailed p-Value           0.49746755
Since p-value > α, fail to reject H0. There is insufficient evidence to infer there is a difference in the proportions of employees who had accidents at the two plants.

15. Exercise 14 could also be done as a Chi-square test. First, organize the data as shown below.

                               Plant 1  Plant 2
Employees Who Had An Accident  23       23
Employees Who Had No Accident  127      102
Total                          150      125
H0: The proportions of employees who had accidents are the same at the two plants.
H1: The proportions of employees who had accidents are different at the two plants.
α = 0.05
The output from the Excel tool for Chi-Squared Expected Value Calculations is shown below.

Chi-Squared Expected Values Calculations
Chi-squared test statistic   0.460335
# of expected values <5      0
p-value                      0.497468

Expected values:
                               Plant 1   Plant 2
Employees Who Had An Accident  25.09091  20.90909
Employees Who Had No Accident  124.9091  104.0909
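The agreement between Exercises 14 and 15 is exact: for a 2×2 table, the chi-squared statistic equals the square of the pooled z statistic, and the two-tailed p-values coincide. A sketch demonstrating this (variable names are my own):

```python
from math import sqrt, erf

x1, n1, x2, n2 = 23, 150, 23, 125   # accidents / employees sampled per plant

# Pooled two-proportion z-test, two-tailed
p1, p2 = x1 / n1, x2 / n2
p_pool = (x1 + x2) / (n1 + n2)
z = (p1 - p2) / sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
p_ztest = 1 - erf(abs(z) / sqrt(2))          # = 2 * P(Z >= |z|)

# Chi-squared test on the same 2x2 table (df = 1)
rows = [[x1, n1 - x1], [x2, n2 - x2]]
r_tot, c_tot = [sum(r) for r in rows], [sum(c) for c in zip(*rows)]
n = sum(r_tot)
chi2 = sum((rows[i][j] - r_tot[i] * c_tot[j] / n) ** 2 / (r_tot[i] * c_tot[j] / n)
           for i in range(2) for j in range(2))
p_chi2 = 1 - erf(sqrt(chi2 / 2))             # P(chi2_1 > x) = P(|Z| > sqrt(x))
```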
We arrive at the same conclusion as before (as we would expect). Once again, the p-value is 0.497. Since p-value > α, fail to reject H0. There is insufficient evidence to infer there is a difference in the proportions of employees who had accidents at the two plants.

16. H0: There is no relationship between an individual’s family status and his/her willingness to accept a foreign posting.
H1: There is a relationship between an individual’s family status and his/her willingness to accept a foreign posting.
α = 0.05
The output of the Excel tool for Chi-squared Expected Value Calculations is shown below.

Chi-Squared Expected Values Calculations
Chi-squared test statistic   5.474924
# of expected values <5      0
p-value                      0.140146

Expected values:
Family Status            Accepted Foreign Posting  Declined Foreign Posting
Single, No Children      45.78947                  12.21053
Single with Children     25.26316                  6.736842
Partnered, No Children   37.89474                  10.10526
Partnered with Children  41.05263                  10.94737
All expected values are greater than 5, so we can proceed. The p-value is 0.14, which is greater than α = 0.05. Fail to reject H0. There is not enough evidence to infer that there is a relationship between an individual’s family status and his/her willingness to accept a foreign posting.
17. H0: The absences are equally distributed across the five working days of the week.
H1: The absences are not equally distributed across the five working days of the week.
α = 0.05
There are 48 absences in total in the sample. If the absences are equally distributed across the five working days of the week, then we would expect each of the five days to have 48/5 = 9.6 absences.
X² = Σ (oi − ei)²/ei
   = (15 − 9.6)²/9.6 + (6 − 9.6)²/9.6 + (4 − 9.6)²/9.6 + (7 − 9.6)²/9.6 + (16 − 9.6)²/9.6
   = 12.625
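A sketch of this goodness-of-fit calculation in Python (the closed-form tail probability applies because df = 4 is even; the function name is my own):

```python
from math import exp

observed = [15, 6, 4, 7, 16]                    # absences, Monday..Friday
expected = [sum(observed) / len(observed)] * 5  # 9.6 per day under H0

stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

def chi2_sf_even(x, df):
    """P(chi2_df > x) for even df, via the closed-form Poisson sum."""
    term, total = 1.0, 1.0
    for i in range(1, df // 2):
        term *= (x / 2) / i
        total += term
    return exp(-x / 2) * total

p_value = chi2_sf_even(stat, df=4)              # ~ 0.013261
```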
p-value = P(X² > 12.625) = 0.013261 (using Excel’s CHITEST). Using the table, for four degrees of freedom, we see 0.010 < P(X² > 12.625) < 0.025. Reject H0. There is enough evidence to suggest that the absences are not equally distributed across the five working days of the week.

18. H0: There is no relationship between gender and preferred movie type.
H1: There is a relationship between gender and preferred movie type.
α = 0.04
This problem could be done manually, of course. The output of the Excel tool for Chi-squared Expected Value Calculations is shown below.

Chi-Squared Expected Values Calculations
Chi-squared test statistic   10.8983
# of expected values <5      0
p-value                      0.091571

Expected values:
Favourite Movie Type  Male      Female
Action/Adventure      29.79672  34.20328
Comedy                14.43279  16.56721
Drama                 21.88197  25.11803
Fantasy               16.76066  19.23934
Horror                20.95082  24.04918
Romance               19.08852  21.91148
Thriller              19.08852  21.91148
All of the expected values are ≥ 5, so we can proceed. The p-value is 0.092, and since this is greater than α = 0.04, fail to reject H0. There is not enough evidence to infer that there is a relationship between gender and preferred movie type.
19. H0: The proportions of workers who travel to work via the different methods are the same for the software firm and the accounting firm.
H1: The proportions of workers who travel to work via the different methods at the software firm are different from the proportions of workers who travel to work via the different methods at the accounting firm.
α = 0.05
The output of the Excel tool for Chi-Squared Expected Value Calculations is shown below.

Chi-Squared Expected Values Calculations
Chi-squared test statistic   n/a
# of expected values <5      2
p-value                      n/a

Expected values:
                 By Transit  In Car   On Bicycle  On Foot
Software Firm    50.8481     15.3038  9.873418    1.974684
Accounting Firm  52.1519     15.6962  10.12658    2.025316
Since some of the expected values are less than 5, we cannot proceed. First we must amalgamate categories in a meaningful way. It seems reasonable to combine the categories of travel by bicycle and by foot, since both of these are self-propelled. The reorganized data set will then be:
                 By Transit  In Car  On Bicycle Or On Foot
Software Firm    51          8       19
Accounting Firm  52          23      5
The new output from the Excel tool for Chi-Squared Expected Value Calculations is shown below.

Chi-Squared Expected Values Calculations
Chi-squared test statistic   15.41159
# of expected values <5      0
p-value                      0.00045

Expected values:
                 By Transit  In Car   On Bicycle Or On Foot
Software Firm    50.8481     15.3038  11.8481
Accounting Firm  52.1519     15.6962  12.1519
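The amalgamation step can be checked in code: compute the expected counts, confirm none are below 5, and recompute the statistic. A sketch (function name is my own; df = (2 − 1)(3 − 1) = 2, so the tail probability is simply e^(−x/2)):

```python
from math import exp

def chi2_test(observed):
    """Chi-squared statistic and expected counts for a contingency table."""
    row_tot = [sum(r) for r in observed]
    col_tot = [sum(c) for c in zip(*observed)]
    n = sum(row_tot)
    expected = [[r * c / n for c in col_tot] for r in row_tot]
    stat = sum((o - e) ** 2 / e
               for obs_r, exp_r in zip(observed, expected)
               for o, e in zip(obs_r, exp_r))
    return stat, expected

# Rows: software firm, accounting firm; columns: transit, car, bicycle-or-foot
combined = [[51, 8, 19], [52, 23, 5]]
stat, expected = chi2_test(combined)
cells_below_5 = sum(e < 5 for row in expected for e in row)  # 0 after combining
p_value = exp(-stat / 2)                                     # df = 2
```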
Since the expected values are now all ≥ 5, we can proceed. The p-value is very small, at 0.00045. We have very convincing evidence that the proportions of workers who travel to work via the different methods at the software firm are different from the proportions of workers who travel to work via the different methods at the accounting firm.

20. H0: The distribution of preferences for beer, wine and other alcoholic drinks is the same for males and females.
H1: The distribution of preferences for beer, wine and other alcoholic drinks is different for males and females.
α = 0.05
The output of the Excel tool for Chi-Squared Expected Value Calculations is shown below.

Chi-Squared Expected Values Calculations
Chi-squared test statistic   0
# of expected values <5      0
p-value                      1

Observed values:
        Beer  Wine  Other Alcoholic Drinks
Male    42    36    22
Female  63    54    33
The p-value is 100% in this case, and the test statistic is zero. There is no evidence to support the hypothesis that the distribution of preferences for beer, wine and other alcoholic drinks is different for males and females, because the proportions of males and females who prefer each drink type are exactly the same, for all types of drinks.

21. H0: The proportions of mixed nuts are as specified.
H1: The proportions of mixed nuts are not as specified.
α = 0.025

                              Almonds  Peanuts  Hazelnuts  Cashews  Pecans
Desired %                     22%      48%      10%        10%      10%
Observed Number               80       190      36         31       37
Expected Number (out of 374)  82.28    179.52   37.4       37.4     37.4

Expected values are calculated as: desired % • total number of nuts (374 in this case).
All of the expected values are ≥ 5, so we can proceed.
X² = 1.827
From the table (degrees of freedom = k – 1 = 4), we see that p-value > 0.100. Using CHITEST, we see that p-value = 0.7676. Fail to reject H0. There is insufficient evidence to infer that the proportions of mixed nuts are not as specified.

22. H0: p = 0.50
H1: p > 0.50
α = 0.025
p̂ = 190/374 = 0.50802139, n = 374
Sampling is done without replacement. The company presumably produces a significant quantity of mixed nuts, so the sample is presumably not more than 5% of the population. This means the binomial distribution is still the appropriate underlying model. In this case, we are presuming that one package of mixed nuts constitutes a random sample.
np = 374(0.50) = 187
nq = 374(1 – 0.50) = 187
Both are ≥ 10, so the sampling distribution of p̂ will be approximately normal, with a mean of 0.50 and a standard error of
σp̂ = √(pq/n) = √[(0.50)(0.50)/374] = 0.025854384

p-value = P(p̂ ≥ 190/374)
        = P(z ≥ (190/374 − 0.50) / √[(0.50)(0.50)/374])
        = P(z ≥ 0.31)
        = 1 − 0.6217
        = 0.3783
p-value = 0.3783 > α = 0.025
Fail to reject H0. There is insufficient evidence to infer that there are more than 50% peanuts in the mixed nuts packages.
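The single-proportion test can be sketched as follows (the function name is my own; using `math.erf` avoids the rounding introduced by the z table):

```python
from math import sqrt, erf

def one_prop_z_upper(x, n, p0):
    """Upper-tailed z-test for one proportion (H1: p > p0)."""
    p_hat = x / n
    se = sqrt(p0 * (1 - p0) / n)           # standard error under H0
    z = (p_hat - p0) / se
    return z, 0.5 * (1 - erf(z / sqrt(2)))  # P(Z >= z)

z, p = one_prop_z_upper(190, 374, 0.50)     # z ~ 0.31, p ~ 0.378
```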
23. H0: The proportions of types of nuts are the same for the two companies.
H1: The proportions of types of nuts are different for the two companies.
α = 0.05
The output of the Excel tool for Chi-Squared Expected Value Calculations is shown below.

Chi-Squared Expected Values Calculations
Chi-squared test statistic   2.036656
# of expected values <5      0
p-value                      0.729017

Expected values:
           Almonds   Peanuts  Hazelnuts  Cashews   Pecans
Company B  161.2716  374.973  67.03058   70.34892  64.3759
Company A  81.72842  190.027  33.96942   35.65108  32.6241
All expected values are greater than 5, so we can proceed. The p-value is 0.729, which is greater than α = 0.05. Fail to reject H0. There is not enough evidence to infer that the proportions of types of nuts are different for the two companies.

24. H0: pA – pB = 0
H1: pA – pB ≠ 0
α = 0.05
p̂A = 190/374 = 0.50802139, nA = 374
p̂B = 375/738 = 0.508130081, nB = 738
Sampling is done without replacement, but it is likely that both companies produce many packages of mixed nuts. Again, we are assuming that one package is a random sample.
Check for normality of the sampling distribution:
nAp̂A = 190 > 10
nAq̂A = 184 > 10
nBp̂B = 375 > 10
nBq̂B = 363 > 10
Since the null hypothesis is that there is no difference in the proportions, we can pool the sample data to estimate p̂.
p̂ = (190 + 375)/(374 + 738) = 565/1112 = 0.508093525
z-score = −0.003
P(z ≤ −0.003) ≈ 50%, so the two-tailed p-value is close to 1.
Fail to reject H0. There is insufficient evidence to suggest that there is a difference in the proportions of peanuts in the mixed-nuts packages of the two companies.

25. The Excel template output is shown below.
Confidence Interval Estimate for the Difference in Population Proportions
Confidence Level (decimal form)   0.9
Sample 1 Proportion               0.50802
Sample 2 Proportion               0.50813
Sample 1 Size                     374
Sample 2 Size                     738
n1 • p1hat                        190
n1 • q1hat                        184
n2 • p2hat                        375
n2 • q2hat                        363
Are np and nq >=10?               yes
Upper Confidence Limit            0.05209
Lower Confidence Limit            -0.0523
With 90% confidence, we estimate the interval (-0.0523, 0.0521) contains the true difference in the proportion of peanuts in the mixed-nuts packages of the two companies. This 90% confidence interval is narrower than the 95% confidence interval that corresponds to the two-tailed test (α = 0.05) in the previous exercise. Since we failed to reject the hypothesis of “no difference” in that test, the wider 95% interval must contain zero; here, with a p-value very close to 1, the narrower 90% interval contains zero as well.

26. You cannot use the Chi-square test on the weights of the different-coloured candies directly. The Chi-square test works with discrete qualitative data, and weights are continuous quantitative data. As well, it is important to use the correct counts for the test. Don't be lazy!
If you convert the weights into an approximate number of candies, you can proceed, although you are only approximating. So, for example, if you knew each candy weighed 1.5 grams, you could convert the weights for each colour into a number of candies. Try it! Suppose the weight breakdown was as follows:

        Red  Yellow  Green  Black  Orange  Total
Weight  62   46      58     39     45      250
Use Excel to do a Chi-square test for this data set. Then divide each of the weights by 1.5 grams to get the number of candies of each colour, and repeat. You will see that you do not get the same Chi-square statistic or p-value for the two versions. Only the second version, based on counts, is correct.
Instructor’s Solutions Manual - Chapter 13
Chapter 13 Solutions

Develop Your Skills 13.1
1. The scatter diagram is shown below.
[Scatter diagram: Hendrick Software Sales; Total Sales ($000) vs. Number of Sales Contacts; fitted line y = 6.6519x + 4.7013]
The least-squares regression line is:
total sales ($000) = 6.6519(number of sales contacts) + 4.7013
Interpretation: Each new sales contact results in an increase in sales of approximately $6,652. The y-intercept should not be interpreted, since the sample data did not contain any observations of 0 sales contacts.

2.
The equation of the least-squares regression line is:
monthly spending on restaurant meals = 0.024144(monthly income) + $44.90
Interpretation: Each new dollar in monthly income increases spending on restaurant meals by about 2.4¢.
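The slope and intercept reported in these exercises come from the usual least-squares formulas. A sketch with a small made-up data set, since the exercises' raw data are not reproduced in this manual:

```python
def least_squares(xs, ys):
    """Slope b1 and intercept b0 of the least-squares line y = b0 + b1*x."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1

# Hypothetical (x, y) pairs, chosen so the answer is easy to verify by hand
b0, b1 = least_squares([1, 2, 3, 4], [2, 4, 5, 8])   # b0 = 0.0, b1 = 1.9
```

In Excel, the same coefficients come from SLOPE and INTERCEPT (or from the trendline equation on the chart).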
3. A scatter diagram is shown below.
[Scatter diagram: Smith and Klein Manufacturing; Sales vs. Promotion Expenditure; fitted line y = 30.21x - 148770]
The least-squares regression line is:
annual sales = 30.21(annual promotion spending) − $148,770
Interpretation: Each new dollar in promotion spending results in an increase in annual sales of approximately $30.21. The y-intercept should not be interpreted, since the sample data did not contain any observations of $0 annual promotion spending.

4.
The response variable is the semester average mark, and the explanatory variable is the total number of hours spent working during the semester. The relationship is unlikely to be positive. The equation y = 0.1535x + 90.241 suggests that a student who worked no hours would get a mark of 90%, which seems a little high (though it may not be reasonable to interpret the intercept this way, depending on the range of hours worked in the sample data). It also suggests that for each hour worked, the student’s mark would increase by 0.1535, which seems unlikely. It is more likely that the student's mark would decrease for each hour worked.
5. Because of the way the researcher has posed the question, the response variable is revenues, and the explanatory variable is the number of employees. The scatter diagram is shown below:
[Scatter diagram: Top 25 Global Research Organizations, 2007; Global Research Revenues (US$ Millions) vs. Full-Time Employees; fitted line y = 0.1338x + 140.56]
The least-squares regression line is:
revenue (US$ millions) = 0.1338(number of full-time employees) + 140.56
Interpretation: Each additional employee results in increased revenue of US$0.1338 million (or US$133,800). The y-intercept should not be interpreted, since the sample data did not contain any observations of 0 employees.
Develop Your Skills 13.2
6. The scatter diagram showed an apparently linear relationship between software sales and the number of sales contacts (see Develop Your Skills 13.1, Exercise 1).
[Residual plot: residuals vs. Number of Sales Contacts]
The residual plot shows residuals centred on zero, with fairly constant variability. There is no indication that the error terms are not independent. The data were collected over a random sample of months, but the dates of collection are not included, so it is not possible to check for independence of the residuals over time. A histogram of the residuals appears to be approximately normal.
[Histogram: Hendrick Software Sales residuals; frequency vs. residual]
A check of the scatter diagram and the standardized residuals does not reveal any outliers. There are no obvious influential observations. It appears that the sample data meet the requirements of the theoretical model.

7. The scatter diagram does not contain much of a pattern, but if there is a relationship, it appears to be linear.
[Scatter diagram: Spending on Restaurant Meals and Income; Monthly Spending on Restaurant Meals vs. Monthly Income; fitted line y = 0.0241x + 44.903]
[Residual plot: residuals vs. Monthly Income]
The residual plot shows a fairly constant variability, although the residuals appear to be a little larger on the positive side (except in the area of monthly incomes of around $3,500). There is no obvious dependence among the residuals.
A histogram of the residuals appears to be approximately normal.
[Histogram: residuals for model of restaurant spending and monthly income]
A check of the scatter diagram and the standardized residuals reveals six points that could be considered outliers. They are circled on the scatter diagram below.
[Scatter diagram: Spending on Restaurant Meals and Income, with the six outliers circled]
The presence of so many outliers is a cause for concern. [If we had access to the original data set, we would check to see that these observations were accurate.] These outliers obviously increase the variability of the error terms. Even if the data points identified as outliers are correct, they are an indication that the model will probably not be very useful for prediction purposes. There are two points in the data set that may be influential observations. They are indicated in the scatter diagram below.
[Scatter diagram: Spending on Restaurant Meals and Income, with the two potentially influential observations indicated]
To investigate, each point is removed from the data set, to see the effect on the least-squares regression line. The least-squares line for the original sample data set was y = 0.0241x + 44.903. Without the circled point on the right-hand side, the equation changes to y = 0.0214x + 50.639, which is not that much of a change, relatively speaking. Similarly, the outlier at (1258.97, 154.68) could be having a large effect on the least-squares line. Removing it changes the equation to y = 0.0262x + 39.292, which has more of an effect. Still, neither point appears to be affecting the regression relationship by a large amount (relatively speaking).
However, at this point in the analysis, it would be useful to go back to the beginning. It does not appear that monthly income is a strong predictor of monthly restaurant spending. There is too much variability in the restaurant spending data, for the various income levels, for us to develop a useful model.

8. The scatter diagram shows the points arranged in a linear fashion. However, the scatter around the regression line appears to widen as the amount of promotional spending increases. This shows quite clearly in the residual plot.
[Residual plot: residuals vs. Promotion Expenditure]
At this point, it is clear that the data do not meet the requirements of the theoretical model. [For completeness, we will continue to check the other requirements.]
This is time-series data, and so the residuals should be plotted against time. The resulting plot shows a definite pattern over time, with the residuals widening in more recent years. This again indicates a problem; the current model does not meet the requirements of the theoretical model.
[Plot: Residuals Over Time, Smith and Klein Manufacturing; residuals vs. year, 1980 to 2010]
At this point, it is clear that the model should be re-specified. Introducing time as an explanatory variable would probably be of interest.
9. With the two erroneous data points removed, the scatter diagram looks as shown below.
[Scatter diagram: Hours of Work and Semester Marks; Semester Average Mark vs. Total Hours at Paid Job During Semester; fitted line y = -0.144x + 89.175]
The relationship appears to be linear. The residual plot is shown below.
[Residual plot: residuals vs. Total Hours at Paid Job During Semester]
The residuals appear centred on zero, with fairly constant variability, although variability seems greatest in the middle of the range of hours worked.
There is no indication that the residuals are dependent. A histogram of the residuals is shown below.
[Histogram: residuals for semester mark and hours of work data]
The histogram is quite normal in shape. A check of the standardized residuals does not reveal any that are ≤ −2 or ≥ +2, although there is one observation with a standardized residual of −1.99. This is the observation (72, 65). [If we could, we would check this data point to make sure that it is accurate.] This point is quite obvious in both the scatter diagram and the residual plot (the point is circled in these two graphs). There are no obvious influential observations, except perhaps for the almost-outlier. Removing this point from the data set does not affect the least-squares regression line significantly. Despite the one troublesome point, the data set does appear to meet the requirements of the theoretical model.
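The ±2 screen on standardized residuals can be sketched as follows. This is a simplified version that divides each residual by the residual standard error s and ignores leverage; the data and fitted line below are hypothetical:

```python
def standardized_residuals(xs, ys, b0, b1):
    """Residuals divided by s = sqrt(SSE/(n-2)); values beyond +/-2 flag
    potential outliers (simplified: leverage is ignored)."""
    n = len(xs)
    resid = [y - (b0 + b1 * x) for x, y in zip(xs, ys)]
    s = (sum(r * r for r in resid) / (n - 2)) ** 0.5
    return [r / s for r in resid]

# Hypothetical data with its least-squares line y = 0 + 1.9x
std_res = standardized_residuals([1, 2, 3, 4], [2, 4, 5, 8], 0.0, 1.9)
outliers = [r for r in std_res if abs(r) >= 2]   # none in this small example
```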
10. The relationship between revenues and number of employees appears to be linear. The residual plot is shown below.
[Residual plot: residuals vs. Full-Time Employees]
The residuals do not appear to be centred on zero, and the variability is not constant. At this point, it appears that this sample data set does not meet the requirements of the theoretical model. A histogram of the residuals is shown below.
[Histogram: residuals for Top 25 Global Research Organizations]
The histogram of residuals confirms what we saw in the residual plot. The residuals are highly skewed to the right. There is one observation with a standardized residual of 3.8. The corresponding point is circled on the residual plot above.

Develop Your Skills 13.3
11. Since the sample data meet the requirements, it is acceptable to proceed with the hypothesis test.
H0: β1 = 0 (that is, there is no linear relationship between the number of sales contacts and sales)
H1: β1 > 0 (that is, there is a positive linear relationship between the number of sales contacts and sales)
α = 0.05
From the Excel output, t = 7.64. The p-value is 9.38E-08, which is very small. The p-value for the one-tailed test is only half of this value, and is certainly < α. In other words, there is almost no chance of getting sample results like these if in fact there is no linear relationship between the number of sales contacts and sales. Therefore, we can (with confidence) reject the null hypothesis and conclude there is evidence of a positive linear relationship between the number of sales contacts and sales for the Hendrick Software Sales Company.

12. We already expect that the model will not be particularly useful. The number of data points with standardized residuals either ≥ +2 or ≤ −2 is a concern. However, the hypothesis test provides some evidence that there is a linear relationship between monthly income and monthly spending on restaurant meals.
H0: β1 = 0 (that is, there is no linear relationship between monthly income and monthly spending on restaurant meals)
H1: β1 > 0 (that is, there is a positive linear relationship between monthly income and monthly spending on restaurant meals)
α = 0.05
From the Excel output, t = 4.6. The p-value on the output is 1.338E-05, and the p-value for the one-tailed test is half of this. Reject H0 and conclude there is evidence of a positive linear relationship between monthly income and monthly spending on restaurant meals.

13.
Since the sample data do not meet the requirements of the theoretical model, it is not appropriate to conduct a hypothesis test.
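Excel's t statistic for the slope is b1 divided by its standard error, s/√Sxx, with n − 2 degrees of freedom. A sketch with the same kind of hand-checkable hypothetical data (the p-value still comes from t tables or software, since the t distribution is not in the Python standard library):

```python
from math import sqrt

def slope_t_stat(xs, ys):
    """t statistic for H0: beta1 = 0, with n - 2 degrees of freedom."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b1 = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sxx
    b0 = y_bar - b1 * x_bar
    sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))
    s = sqrt(sse / (n - 2))          # residual standard error
    return b1 / (s / sqrt(sxx))

# Hypothetical data; with only 2 df the critical values are large
t = slope_t_stat([1, 2, 3, 4], [2, 4, 5, 8])
```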
14. Since the sample data meet the requirements, it is acceptable to proceed with the hypothesis test.
H0: β1 = 0 (that is, there is no linear relationship between the number of hours worked during the semester and the semester average grade)
H1: β1 < 0 (that is, there is a negative linear relationship between the number of hours worked during the semester and the semester average grade)
α = 0.05
From the Excel output, t = −10.01. The p-value is 2.47086E-12, which is very small. The p-value for the one-tailed test is only half of this value, and is certainly < α. In other words, there is almost no chance of getting sample results like these if in fact there is no linear relationship between the number of hours worked during the semester and the semester average grade. Therefore, we can (with confidence) reject the null hypothesis and conclude there is evidence of a negative linear relationship between the number of hours worked during the semester and the semester average grade.

15. Since the sample data do not meet the requirements of the theoretical model, it is not appropriate to conduct a hypothesis test.

Develop Your Skills 13.4
16. From the Excel output, R2 = 0.72. This means that 72% of the variation in sales is explained by the number of sales contacts. This suggests a fairly strong linear association between the two variables, which is not surprising. Assuming the original data were collected correctly, it is possible that the other factors affecting sales have been randomized. In such a case, it would seem reasonable to conclude that increasing sales contacts would lead to increased sales. However, there will likely be limits to the positive impact that could be created. Presumably, salespeople contact their best prospective clients first, so additional contacts may not be as productive. As well, increasing the number of contacts may reduce the quantity of time spent with each contact, which could have a detrimental effect on sales.

17.
The R2 value for this data set is only 0.18. This is not surprising, because the scatter diagram of the relationship revealed scarcely any perceivable pattern. Only 18% of the variation in monthly spending on restaurant meals is explained by income. Earlier investigations suggested this model was not worth pursuing, and the low R2 value reinforces that. 18. The R2 value is fairly high, at 0.83. This means that 83% of the variation in Smith and Klein’s sales is explained by sales promotion spending. However, while there is a strong association between the two variables, the linear regression model is not a good one.
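Excel reports R2 in its regression output; for a simple (one-variable) regression it is just the square of the sample correlation between x and y. A minimal Python sketch with made-up data (not any of the textbook data sets):

```python
import numpy as np

# Hypothetical (x, y) sample -- illustrative values only.
x = np.array([2.0, 4.0, 5.0, 7.0, 8.0, 10.0])
y = np.array([3.1, 5.2, 5.9, 8.4, 8.8, 11.2])

# For simple linear regression, the coefficient of determination
# equals the squared sample correlation coefficient.
r = np.corrcoef(x, y)[0, 1]
r_squared = r ** 2
print(round(r_squared, 3))
```

A value near 1 means most of the variation in y is accounted for by x; as Exercises 18 and 20 stress, a high R2 alone does not make the model a good one.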
Copyright © 2011 Pearson Canada Inc.
19. The R2 value, at 0.72, suggests that 72% of the variation in semester average marks is explained by hours spent working during the semester. (Note that this is for the amended data set, where the two erroneous grades have been removed; see Develop Your Skills 13.2, Exercise 9.) Obviously, there are many factors that affect semester average marks, for example, ability, study habits, past educational experience, and so on. If the original data were collected in a truly random fashion, these factors may have been randomized. It seems reasonable to conclude that students who work less will have more time for their studies, and it seems reasonable to think that marks improve with time spent studying. However, this data set does not guarantee that reducing work will lead to improved marks.

20. The R2 value is 0.93. This value looks very promising, but remember that the model did not meet the requirements of the theoretical model. A high R2 value does not guarantee a cause-and-effect relationship, or a useful model.

Develop Your Skills 13.5

21. Since the requirements are met, it is appropriate to create a confidence interval. The Excel output is shown below.

Confidence Interval and Prediction Intervals - Calculations
Confidence Level (%): 98
Point 1: Number of Sales Contacts = 10
Prediction Interval: lower limit 44.96826, upper limit 97.471443
Confidence Interval: lower limit 66.068659, upper limit 76.37104
With 98% confidence, the interval ($66,069, $76,371) contains the average sales for 10 sales contacts.

22. We have already established this is not a good model. However, even if it were a good model, we would not use it to predict monthly spending on restaurant meals based on a monthly income of $6,000. The highest monthly income in the sample data set is $4,056, and so we should not rely on our model to make predictions for a monthly income of $6,000.

23. Since the requirements are not met, it is not appropriate to create a confidence interval.
24. The Excel output is shown below (note that this is for the amended data set, where the two erroneous grades have been removed; see Develop Your Skills 13.2, Exercise 9).

Confidence Interval and Prediction Intervals - Calculations
Confidence Level (%): 95
Point 1: Total Hours at Paid Job During Semester = 200
Prediction Interval: lower limit 46.027952, upper limit 74.74128231
Confidence Interval: lower limit 58.1586403, upper limit 62.61059452
With 95% confidence, the interval (58.2, 62.6) contains the mean semester average mark when students work 200 hours in paid employment during the semester.

25. Since the requirements are not met, it is not appropriate to construct a prediction interval.

Chapter Review Exercises

1. The hypothesis test is only valid if the required conditions are met. If you don't check the conditions, you may rely on a hypothesis test when it is misleading.

2. Regression prediction intervals are wider than confidence intervals because the prediction interval has to account for the distribution of individual y-values around the regression line. The regression confidence interval has to take into account only that the sample regression line may not match the true population regression line.
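The width difference can be verified numerically with the standard interval formulas. A sketch using simulated data (assumed values, not from the text); the prediction interval's extra "1 +" term under the square root reflects the variability of individual y-values:

```python
import numpy as np
from scipy import stats

# Simulated sample for a simple regression -- illustrative only.
rng = np.random.default_rng(1)
x = np.linspace(0.0, 10.0, 30)
y = 5.0 + 2.0 * x + rng.normal(0.0, 1.0, size=30)

n = len(x)
b1, b0 = np.polyfit(x, y, 1)                 # slope, intercept
resid = y - (b0 + b1 * x)
s = np.sqrt(np.sum(resid ** 2) / (n - 2))    # standard error of estimate
sxx = np.sum((x - x.mean()) ** 2)

x0 = 5.0                                     # point of interest
t = stats.t.ppf(0.975, df=n - 2)             # 95% two-sided critical value

# Half-widths: the prediction interval has the extra "1 +" term.
ci_half = t * s * np.sqrt(1 / n + (x0 - x.mean()) ** 2 / sxx)
pi_half = t * s * np.sqrt(1 + 1 / n + (x0 - x.mean()) ** 2 / sxx)
print(ci_half, pi_half)
```

However the numbers fall out, the prediction half-width always exceeds the confidence half-width at the same x-value.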
3. A lower standard error means that confidence and prediction intervals will be narrower. Predictions made with the model will therefore be more useful.

4. You should not make predictions outside the range of the sample data on which the regression relationship is based, because the relationship may be very different there. For example, a linear model may provide a good approximation of a portion of a relationship that is actually a curved line. However, if the line is extended beyond this portion, it could be quite misleading.
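A simple way to honour this rule in software is to refuse predictions outside the observed x-range. The numbers below (slope, intercept, range limits) are hypothetical stand-ins, not fitted values from the text, though the out-of-range case mirrors Exercise 22's $6,000 income against a sample maximum of $4,056:

```python
def predict_within_range(x0, slope, intercept, x_min, x_max):
    """Apply a fitted line only inside the range of the sample data."""
    if not (x_min <= x0 <= x_max):
        raise ValueError("x0 is outside the sample range; "
                         "the fitted relationship may not hold there")
    return intercept + slope * x0

# In range: a prediction is returned.
print(predict_within_range(3000, 0.05, 100.0, 330, 4056))

# Calling with x0 = 6000 would raise ValueError, because the model
# should not be extrapolated beyond the observed incomes.
```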
5. It is always tempting to just remove problem data points. However, if you do this, you will often find that the remaining data points also contain outliers. If you persist in removing troublesome data points, you may not have much data left! Careful thinking is a better approach. The outlier may be telling you something really important about the actual relationship between the explanatory and response variables. You wouldn't want to miss this important clue to what is really going on.
6. The scatter diagram is shown below.
[Scatter diagram: List Price and Odometer Reading for 2006 Honda Civic Sedan (as of Fall 2008); List Price vs. Odometer Reading; trendline y = -0.0374x + 18017.]
The relationship is: $list price = -0.0374 (odometer reading in kilometres) + $18,017. For this small car, the base asking price is $18,017, which is reduced by about 3.7¢ for every kilometre on the odometer. However, note that this "base asking price" should not be trusted for any cars with fewer than 8,600 kilometres, since no cars in the data set had odometer readings below that.
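The fitted line can be applied directly. For example, for a car with 50,000 kilometres on the odometer (the value used again in Exercise 7):

```python
# Coefficients from the fitted line above.
slope = -0.0374       # dollars per kilometre
intercept = 18017.0   # intercept of the fitted line, in dollars

odometer = 50_000
predicted_price = intercept + slope * odometer
print(round(predicted_price))
```

The result, about $16,147, sits near the middle of the 95% prediction interval computed in Exercise 7.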
7. We have already examined the scatter diagram, which suggests a negative linear relationship. The residual plot is shown below. It has the desired appearance of constant variability, with the residuals centred on zero.
[Residual plot: Odometer Residual Plot; residuals vs. odometer reading.]
A histogram of the residuals is shown below. The histogram is not perfectly normally distributed, but it is approximately so.
[Histogram: Residuals for Honda Civic List Price Model, Based on Odometer.]
There are no standardized residuals ≥ +2 or ≤ -2. It appears the sample data meet the requirements of the theoretical model, and so it would be appropriate to use odometer readings to predict the list prices of these used cars. A 95% prediction interval for the list price for one of these cars with 50,000 kilometres on the odometer is ($12,683, $19,608). The Excel output is shown below.

Confidence Interval and Prediction Intervals - Calculations
Confidence Level (%): 95
Point 1: Odometer = 50,000
Prediction Interval: lower limit 12683.4909, upper limit 19607.9242
Confidence Interval: lower limit 15259.8312, upper limit 17031.584
8. A scatter diagram showing the two stock market indexes is shown below. Note that the data used are the "adjusted close" figures. You must take care to match the dates; there are a few instances when one market is open and the other is not. Observations that did not have a match were removed from the data set.
[Scatter diagram: TSX and DJI, January to June 2009; S&P/TSX Composite Index vs. Dow Jones Industrial Average; trendline y = 1.2553x - 894.84.]
The estimated relationship is as follows: TSX Composite Index = 1.255 (DJI) – 895
Note that the choice of variable on the x or y axis is somewhat arbitrary here. Because Canada's economy is so dependent on exports to the US, the DJI is placed as the "explanatory" variable, but the cause and effect is not direct.

9. The coefficient of determination for the TSX and the DJI over the first six months of 2009 is 0.72. This measure suggests that 72% of the variation in the TSX is explained by variation in the DJI.
10. This data set is not a random sample; it includes all matched observations over the period studied. Nor could it reasonably be treated as one: the credit crisis and the recession that were affecting the stock markets in the first six months of 2009 made this period unreliable as a model of how the two indexes behave during more normal times. However, it is interesting to examine the patterns in the indexes over the period. The indexes were more closely related at the beginning of 2009 than they were later in the period. A time-series plot reveals this quite clearly.
[Time-series plot: TSX and DJI, January to June 2009; DJI and TSX index values over time.]
The required conditions are not met (as we might expect, given the graph above).
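One numerical companion to a residuals-over-time plot (a check the text performs graphically, not a method it prescribes) is the Durbin-Watson statistic, computed on the residuals in time order: values near 2 suggest no lag-1 autocorrelation, while a strong time pattern pushes the statistic toward 0. A sketch on artificial residual series, not the TSX/DJI residuals:

```python
import numpy as np

def durbin_watson(resid):
    """Durbin-Watson statistic; roughly 2 when successive residuals are unrelated."""
    diff = np.diff(resid)
    return np.sum(diff ** 2) / np.sum(resid ** 2)

rng = np.random.default_rng(7)
independent = rng.normal(size=200)            # no time pattern
trending = np.cumsum(rng.normal(size=200))    # strongly autocorrelated series

print(round(durbin_watson(independent), 2))   # near 2
print(round(durbin_watson(trending), 2))      # near 0
```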
The residual plot clearly shows non-constant variability.
[Residual plot: DJI Residual Plot; residuals vs. DJI.]
As well, the histogram of residuals shows marked negative skewness.
[Histogram: Residuals, TSX and DJI Data, January to June 2009.]
A plot of the residuals over time clearly shows a time-related pattern.
[Plot: Residuals Over Time, TSX and DJI Data, January to June 2009.]
11. A scatter diagram is shown below.
[Scatter diagram: Student Marks in Statistics; Mark on Final Exam vs. Mark on Test #2; trendline y = 0.9586x + 0.4464.]
The estimated relationship is as follows: Mark on final exam = 0.9586 (Mark on Test #2) + 0.4464 In other words, it appears the mark on the final exam is about 96% of the mark on Test #2.
12. The residual plot has the desired appearance.
[Residual plot: Mark on Test #2 Residual Plot; residuals vs. mark on Test #2.]
A histogram of the residuals appears approximately normally-distributed.
[Histogram: Residuals for Final Exam Marks Prediction Model.]
There are no obvious influential observations or outliers. It appears that the sample data conform to the requirements of the theoretical model.
13. Since the sample data meet the requirements, it is acceptable to proceed with the hypothesis test.
H0: β1 = 0 (that is, there is no linear relationship between the mark on Test #2 and the final exam mark in Statistics)
H1: β1 > 0 (that is, there is a positive linear relationship between the mark on Test #2 and the final exam mark in Statistics)
α = 0.05
From the Excel output, t = 16.5. The reported p-value is 2.96E-14, which is very small; the p-value for the one-tailed test is half of this value, and is certainly < 5%. In other words, there is almost no chance of getting sample results like these if in fact there is no linear relationship between the mark on Test #2 and the final exam mark in Statistics. Therefore, reject H0 and conclude there is strong evidence of a positive linear relationship between the mark on Test #2 and the final exam mark in Statistics.

14a. The Excel output is shown below.

Prediction Interval: lower limit 51.78719489, upper limit 73.7293732
Confidence Interval: lower limit 60.5028627, upper limit 65.013705
b. The 95% confidence interval estimate for the average exam mark of students who had a mark of 65% on the second test in the Statistics course is (60.5, 65.0).

c. The 95% prediction interval estimate for the exam mark of a student who had a mark of 65% on the second test in the Statistics course is (51.8, 73.7). This interval is wider because it must take into account the variability in the individual exam marks around the regression line; the regression prediction interval is always wider than the confidence interval.
15. A scatter diagram of the data is shown below.
[Scatter diagram: Aries Car Parts; Auditor's Inventory Value vs. Recorded Parts Inventory Value; trendline y = 0.9806x + 25.233.]
If the inventory records are generally accurate, we would expect the slope of the regression line to be very close to 1, as it appears to be. It appears there is a strong positive relationship between the recorded inventory value and the audited inventory value. The relationship is as follows: auditor's inventory value = 0.9806(recorded parts inventory value) + $25.23
16. As the scatter diagram created for Exercise 15 indicates, there appears to be a fairly strong positive linear relationship between the recorded and audited inventory values. The residual plot is shown below.
[Residual plot: Recorded Parts Inventory Value Residual Plot; residuals vs. recorded parts inventory value.]
The residual plot shows residuals fairly randomly distributed around zero, with about the same variability for all x-values. There are two residuals that show unusual variability. They are circled in the plot. The data were all collected at about the same point in time, so there is no need to check residuals against time. A review of the standardized residuals reveals two outliers, observation #1 and observation #25 (these are the two points that are circled in the residual plot). Since the auditor has realized that he misread the written records for both data points, we will amend the data, and re-do the analysis.
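The screen used here (flagging observations whose standardized residuals are ≥ +2 or ≤ -2) can be sketched as follows. The data are simulated, with one deliberately misrecorded value planted; the standardization below (residual divided by the standard error of estimate) is one common simple form, close to but not identical to what Excel reports:

```python
import numpy as np

# Simulated inventory-style data with one planted recording error.
rng = np.random.default_rng(3)
x = np.linspace(10.0, 1000.0, 35)
y = 25.0 + 0.98 * x + rng.normal(0.0, 15.0, size=35)
y[10] += 120.0                          # the "misread" record

b1, b0 = np.polyfit(x, y, 1)
resid = y - (b0 + b1 * x)
s = np.sqrt(np.sum(resid ** 2) / (len(x) - 2))

# Simple standardized residuals: residual / standard error of estimate.
standardized = resid / s
outliers = np.flatnonzero(np.abs(standardized) >= 2)
print(outliers)
```

The planted observation shows up in the flagged list, just as observations #1 and #25 stood out for the auditor.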
The new scatter diagram is as shown below.
[Scatter diagram (amended data): Aries Car Parts; Auditor's Inventory Value vs. Recorded Parts Inventory Value; trendline y = 0.9783x + 25.227.]
The new regression relationship is as follows: audited inventory value = 0.9783(recorded inventory value) + $25.23
The residual plot for the amended data plot is shown below.
[Residual plot (amended data): Recorded Parts Inventory Value Residual Plot.]
The residual plot for the amended data set looks acceptable. A histogram of the residuals for the amended data set is shown below.
[Histogram: Residuals for Aries Car Parts Model.]
The histogram of residuals shows some positive skewness, and this is a cause for concern, suggesting caution in the use of the model.
A check of the standardized residuals does not reveal any outliers. There are no obviously influential observations. It appears the corrected data set meets the requirements for the linear regression model, although the distribution of the residuals is not as normal in shape as is desired.

17. While we have some concern about the distribution of residuals, we will proceed with the hypothesis test.
H0: β1 = 0 (that is, there is no linear relationship between the recorded inventory values and the audited inventory values)
H1: β1 ≠ 0 (that is, there is a linear relationship between the recorded inventory values and the audited inventory values)
α = 0.05
An excerpt of Excel's regression output is shown below.

SUMMARY OUTPUT

Regression Statistics
Multiple R          0.995213711
R Square            0.99045033
Adjusted R Square   0.990160946
Standard Error      16.61634358
Observations        35

ANOVA
             df   SS            MS            F
Regression    1   944994.372    944994.372    3422.616936
Residual     33   9111.394836   276.1028738
Total        34   954105.7668

                                 Coefficients   Standard Error   t Stat        P-value
Intercept                        25.22708893    8.612571593      2.929100636   0.006122286
Recorded Parts Inventory Value   0.978281557    0.016721865      58.50313612   6.47389E-35
From the Excel output, t = 58.503. The p-value is 6.47389E-35, which is very small, and certainly < 5%. In other words, there is almost no chance of getting sample results like these, if in fact there is no linear relationship between the recorded inventory values and the audited inventory values. Therefore, reject the null hypothesis and conclude there is evidence of a linear relationship between the recorded and audited inventory values.
18. The coefficient of determination for the amended (corrected) data on actual and recorded inventory values for Aries Car Parts is 0.9905. This means that a little over 99% of the variation in the audited inventory values is explained by differences in the recorded inventory values. Such a strong relationship suggests confidence in the recorded inventory values. 19. The scatter diagram for these data is shown below.
[Scatter diagram: Revenue and Profit for a Random Sample of Top 1000 Canadian Companies, 2008; Profit (000) vs. Revenue (000); trendline y = 5.8784x + 478280.]
Notice that the trendline is greatly influenced by the three data points from the three largest organizations in the data set. If we remove these observations, the scatter diagram looks as shown below.
[Scatter diagram (three largest organizations removed): Revenue and Profit for a Random Sample of Top 1000 Canadian Companies, 2008; Profit (000) vs. Revenue (000); trendline y = -1.3658x + 250301.]
The coefficient of determination for the full data set is 0.88, which is quite high. However, the measure is misleading. When the three largest data points are removed, the coefficient of determination is only 0.04, which seems more appropriate. A high coefficient of determination never guarantees that a relationship is a good model, and it certainly does not in this case.
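The distortion described here is easy to reproduce with synthetic numbers: a cluster of essentially unrelated points plus a few dominant ones yields a high R2, which collapses when the dominant points are removed. All values below are invented for illustration:

```python
import numpy as np

def r_squared(x, y):
    r = np.corrcoef(x, y)[0, 1]
    return r ** 2

rng = np.random.default_rng(2)
# Thirty small firms: revenue and profit essentially unrelated.
revenue = rng.uniform(10.0, 100.0, size=30)
profit = rng.normal(50.0, 20.0, size=30)

# Add three giant firms that dominate the fitted line.
revenue_all = np.append(revenue, [4000.0, 4500.0, 5000.0])
profit_all = np.append(profit, [24000.0, 27000.0, 30000.0])

print(round(r_squared(revenue_all, profit_all), 2))  # high
print(round(r_squared(revenue, profit), 2))          # near zero
```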
20. A scatter diagram for the data is shown below.
[Scatter diagram: Performance of Graduates on Test Given During Job Interview; Score on Test Given During Job Interview vs. Final Overall Average Grade; trendline y = 0.6421x + 4.9775, R² = 0.7989.]
It appears there is a positive linear relationship between the final overall average grade and the score on the test given during the job interview. The regression relationship is as follows: score on test given during job interview = 0.6421(final overall average grade) + 4.98. This is promising. Since the grades are marked out of 100, and the test scores are out of 70, the slope would be 0.70 if the relationship were perfect.
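As a quick check on the fitted line, the predicted interview-test score for a graduate with a 75% overall average:

```python
# Coefficients from the fitted line above.
slope = 0.6421
intercept = 4.9775

grade = 75
predicted_score = intercept + slope * grade
print(round(predicted_score, 1))
```

The result, about 53.1, falls inside the 98% confidence interval (51.5, 54.8) computed in Exercise 23.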
21. As discussed in Exercise 20 above, there appears to be a positive linear relationship between the final overall average grade and the score on the test given during the job interview. The residual plot is shown below.
[Residual plot: Final Average Mark Residual Plot; residuals vs. final average mark.]
The residuals appear randomly distributed around zero, with the same variability for all x-values. A histogram of the residuals is shown below.
[Histogram: Residuals for Test Score Model.]
The residuals appear approximately normally distributed.
There are no outliers or obviously influential observations in the data set. It appears these data meet the requirements for the linear regression model.

22. Since the requirements are met, it is appropriate to test for a positive linear relationship.
H0: β1 = 0 (that is, there is no linear relationship between the final average mark and the score on the test given during the job interview)
H1: β1 > 0 (that is, there is a positive linear relationship between the final average mark and the score on the test given during the job interview)
α is not given.
We are provided with only an excerpt of Excel output. However, we know that
t = b1/sb1 = 0.642105/0.060878 = 10.547
We can approximate the p-value using a t-table, with n - 2 = 28 degrees of freedom. Since t0.005 = 2.763, we know the p-value is considerably less than 0.005. In other words, there is almost no chance of getting sample results like these if in fact there is no linear relationship between the overall average mark of the graduate and the company test scores. Therefore, we can confidently reject the null hypothesis and conclude there is evidence of a positive linear relationship.

23. Since the requirements are met, it is appropriate to create a confidence interval estimate. The Excel output is shown below.

Confidence Level (%): 98
Point 1: Final Average Mark = 75
Prediction Interval: lower limit 43.9917459, upper limit 62.278853
Confidence Interval: lower limit 51.4810585, upper limit 54.78954
With 98% confidence, we estimate that the interval (51.5, 54.8) contains the average test score of graduates with an overall average mark of 75.
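The t-table approximation used in Exercise 22 can be checked directly with the t-distribution's upper-tail probability:

```python
from scipy import stats

# t statistic computed from the Excel excerpt: b1 / s_b1.
t_stat = 0.642105 / 0.060878
p_one_tailed = stats.t.sf(t_stat, df=28)   # upper-tail probability
print(t_stat, p_one_tailed)
```

The exact one-tailed p-value is far below 0.005, confirming the t-table bound.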
24. Refer back to the output shown above in the solution to Exercise 23. With 98% confidence, we estimate that the interval (44.0, 62.3) contains the test score of a student with an overall average mark of 75. It is difficult to decide if the company should continue to administer its own test. The answer depends on how reliable a predictor of future performance the test has been, and what the costs of administering the tests have been. If the company test makes a major distinction between the predicted performance of someone with a test score of 44 and someone with a test score of 62, then the overall average grade may not be a good substitute. However, there is a fairly strong relationship between the two variables. Perhaps the company could pilot using the overall average grade with a random sample of graduates, to see how well they do.

25. No, it would not be appropriate to use package weight as a predictor of shipping cost. We can see from the residual plot that variability increases as package weight increases.

26. It is often suggested that the Canadian stock market is very closely tied to the price of oil. A data set of weekly values for the Toronto Stock Exchange Composite Index (TSX) and the Canadian spot price of oil in dollars per barrel for the period from January 2000 to June 2009 was examined. The scatter diagram, shown below, suggests that while there may be a relationship between the two variables, it is not linear.
[Scatter diagram: TSX and Canadian Oil Prices; S&P TSX Composite Index vs. Weekly Canadian Par Spot Price (Dollars per Barrel); trendline y = 76.584x + 6039.5, R² = 0.6902.]
The non-linearity is evident in the residual analysis, as well.
[Residual plot: Weekly Canadian Par Spot Price FOB (Dollars per Barrel) Residual Plot.]
[Plot: Residuals Over Time, TSX and Oil Price Model; weekly observations, January 2000 to June 2009.]
There appears to be a time-related pattern in the residuals. This is also apparent in the pattern of extreme residuals (those with standardized residuals either ≥ +2 or ≤ -2). They occur in August 2000, January to July 2007, July 2008, and September to October 2008. While the model could probably be improved by the addition of a time variable, it is not clear how this could be used for predictive
purposes. It would probably be more useful to investigate what other explanatory variables were affecting the stock market over this period. As well, non-linear models could be explored.
Instructor's Solutions Manual – Chapter 14
Chapter 14 Solutions

Develop Your Skills 14.1

1. Scatter diagrams are shown below.
[Scatter diagrams: Salary and Age; Salary and Years of Postsecondary Education; Salary and Years of Experience; Salary ($000) on the vertical axis in each.]
All three scatter diagrams show the expected positive relationships. Salary appears to be positively and linearly related to age, years of postsecondary education, and years of experience. We note that the variability in salary increases at older ages and at greater years of experience. Salary seems more strongly related to age for ages under about 40, and more strongly related to years of experience below about 15 years. Salary is more variable when plotted against years of postsecondary education than against the other explanatory variables, but that variability is more constant across the range. At this point, years of experience appears to be the strongest candidate as an explanatory variable.

2. An excerpt of the Regression output is shown below.

SUMMARY OUTPUT

Regression Statistics
Multiple R          0.928389811
R Square            0.861907642
Adjusted R Square   0.850399945
Standard Error      6.844780506
Observations        40

ANOVA: Regression df = 3, Residual df = 36, Total df = 39

                                   Coefficients
Intercept                          27.70373012
Age                                -0.3191034
Years of Postsecondary Education   2.846348768
Years of Experience                1.568477845
The model is as follows: Salary ($000) = 27.7 -0.3 (Age) + 2.8 (Years of Postsecondary Education) + 1.6 (Years of Experience) In other words, salary is $27,704 - $319 for each year of age + $2,846 for each year of postsecondary education + $1,568 for each year of experience. The coefficient for age does not seem appropriate, and points to problems with this model.
3. The scatter diagram for age and years of experience is shown below. Note that the age axis starts at 20, since there are no workers under 20 years of age.
[Scatter diagram: Age and Years of Experience; Years of Experience vs. Age, with the age axis starting at 20.]
The two variables are very closely related, as we would expect. It is not possible to acquire years of experience without also acquiring years of age. It does not make sense to include both explanatory variables in the model.
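Age and years of experience being near-duplicates is classic multicollinearity, and it is what produced the wrong-signed age coefficient in Exercise 2. A synthetic sketch of the same situation (all numbers invented):

```python
import numpy as np

rng = np.random.default_rng(5)
experience = rng.uniform(0.0, 35.0, size=40)
# Age is essentially experience plus schooling, so the two move together.
age = experience + 22.0 + rng.normal(0.0, 2.0, size=40)
salary = 25.0 + 1.5 * experience + rng.normal(0.0, 5.0, size=40)

# The two would-be explanatory variables are almost perfectly correlated.
r = np.corrcoef(age, experience)[0, 1]
print(round(r, 2))

# Fitting salary on both still "works", but the individual coefficients
# become unstable and can take counter-intuitive signs.
X = np.column_stack([np.ones(40), age, experience])
coeffs, *_ = np.linalg.lstsq(X, salary, rcond=None)
print(coeffs)
```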
4. An excerpt of the Regression output for the salaries data set with years of postsecondary education and age as explanatory variables is shown below.

SUMMARY OUTPUT

Regression Statistics
Multiple R          0.902574999
R Square            0.814641629
Adjusted R Square   0.804622258
Standard Error      7.822241019
Observations        40

ANOVA: Regression df = 2, Residual df = 37, Total df = 39

                                   Coefficients
Intercept                          -2.125320671
Age                                1.121515567
Years of Postsecondary Education   2.155278577
The model is as follows: Salary ($000) = -2.1 + 1.1 (Age) + 2.2 (Years of Postsecondary Education). In other words, Salary = -$2,125 + $1,122 for each year of age + $2,155 for each year of postsecondary education.
5. An excerpt of the Regression output for the salaries data set with years of postsecondary education and years of experience is shown below.

SUMMARY OUTPUT

Regression Statistics
Multiple R          0.92720353
R Square            0.859706386
Adjusted R Square   0.852122948
Standard Error      6.805249337
Observations        40

ANOVA: Regression df = 2, Residual df = 37, Total df = 39

                                   Coefficients
Intercept                          20.87563476
Years of Postsecondary Education   2.673930095
Years of Experience                1.238703002
The model is as follows: Salary ($000) = 20.9 +2.7 (Years of Postsecondary Education) + 1.2 (Years of Experience) In other words, Salary = $20,876 + $2,674 for every year of postsecondary education + $1,239 for every year of experience.
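Applying this final model to a hypothetical employee, say one with 4 years of postsecondary education and 10 years of experience (an illustration, not one of the exercises):

```python
# Coefficients from the Excel output above (salary in $000).
intercept = 20.87563476
b_education = 2.673930095
b_experience = 1.238703002

education, experience = 4, 10
salary = intercept + b_education * education + b_experience * experience
print(round(salary, 1))
```

The result is about 44.0, i.e. a predicted salary of roughly $44,000.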
Develop Your Skills 14.2

6. The residual plots are shown below.
[Residual plots: residuals vs. Age; residuals vs. Years of Postsecondary Education; residuals vs. Years of Experience.]
All of these residual plots appear to have the horizontal band appearance that is desired, and so these plots appear to be consistent with the required conditions. A plot of the residuals vs. predicted salaries for this model is shown below.
[Plot: Residuals vs. Predicted Salary, for the model with Age, Years of Postsecondary Education, and Years of Experience.]
This residual plot also appears to have the desired horizontal band appearance, centred around zero. There is somewhat less variability in the residuals for predicted salaries under about $40,000, but it is not pronounced.

7. The residual plots are shown below.
[Residual plot: residuals vs. Age.]
[Residual plot: residuals vs. Years of Postsecondary Education.]
In the age residual plot, we see a couple of points above and below the desired horizontal band. Also, the residuals appear to be centred somewhat above zero. The postsecondary education residual plot looks more like the desired horizontal band, although once again the residuals appear to be centred above zero. A plot of the residuals vs. predicted salary is shown below.
[Plot: Residuals vs. Predicted Salary, for the model with Age and Years of Postsecondary Education.]
The plot appears to have the desired horizontal band appearance, with the residuals centred around zero. The two circled points correspond to observations with standardized residuals ≥ +2 or ≤ -2 (observations 29 and 40).
8.
The residual plots are shown below.
[Figure: Years of Postsecondary Education residual plot]
[Figure: Years of Experience residual plot]
Both appear to have the desired horizontal band appearance, centred on zero.
A plot of the residuals vs. predicted salaries is shown below.
[Figure: Residuals vs. Predicted Salary (Years of Postsecondary Education, Years of Experience)]
There appears to be somewhat less variability for lower predicted salaries. However, overall, the plot shows the desired horizontal band, centred around zero.

9. A histogram of the residuals for the model discussed in Exercise 6 is shown below.
[Figure: Histogram of residuals, salary model (Age, Years of Postsecondary Education, Years of Experience)]
The histogram is somewhat skewed to the right. It appears to be centred close to zero. A histogram of the residuals for the model discussed in Exercise 7 is shown below.
[Figure: Histogram of residuals, salary model (Age, Years of Postsecondary Education)]
As for the previous model, we see the histogram is somewhat skewed to the right, and centred approximately on zero. A histogram of the residuals for the model discussed in Exercise 8 is shown below.
[Figure: Histogram of residuals, salary model (Years of Postsecondary Education, Years of Experience)]
As with the others, this histogram appears skewed to the right, but the skewness appears more pronounced here.
10. For the model containing all explanatory variables: There is one observation that produces a standardized residual just slightly above 2. This is data point 26, where the observed salary is $67,400, age is 47, years of postsecondary education are 5, and years of experience are 17. The actual salary is above the predicted salary of $53,600. If we had access to the original records, we would double-check this data point.

For the model containing years of postsecondary education and age as explanatory variables: There are two observations with standardized residuals ≥ +2 or ≤ -2 (observations 29 and 40). As mentioned in the answer to Exercise 7, these two points are obvious in the plot of residuals vs. predicted salary for this model. If we had access to the original records, we would double-check these data points.

For the model containing years of postsecondary education and years of experience as explanatory variables: There are no observations with standardized residuals ≥ +2 or ≤ -2.
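The screening for standardized residuals described above can be sketched in Python. The residuals here are illustrative, not the textbook data, and the standardization is a simple division by the sample standard deviation rather than Excel's leverage-adjusted version:

```python
import numpy as np

def flag_standardized_residuals(residuals, threshold=2.0):
    """Divide each residual by the sample standard deviation of the
    residuals and flag observations whose standardized residual is
    >= +2 or <= -2 (a simple approximation; a leverage-adjusted
    version would use the hat matrix)."""
    residuals = np.asarray(residuals, dtype=float)
    standardized = residuals / residuals.std(ddof=1)
    flagged = np.where(np.abs(standardized) >= threshold)[0]
    return standardized, flagged

# Illustrative residuals (assumed): one value is clearly out of line
res = [1.2, -0.8, 0.5, -1.1, 0.9, -0.4, 12.0, 0.3, -0.7, 0.6]
z, flagged = flag_standardized_residuals(res)
```

Observations flagged this way would be double-checked against the original records, just as the answer above suggests.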
Develop Your Skills 14.3

11. Adjusted R2.
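The adjusted R2 referred to above penalizes R2 for the number of explanatory variables. A minimal sketch (the R2, n, and k values are assumed for illustration, not taken from the Excel output):

```python
def adjusted_r2(r2, n, k):
    """Adjusted R^2 for a model with k explanatory variables fitted
    to n observations:
        adj R^2 = 1 - (1 - R^2) * (n - 1) / (n - k - 1)
    """
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)

# With the same R^2, adding a variable that explains nothing
# lowers the adjusted value (illustrative numbers):
print(round(adjusted_r2(0.86, 40, 3), 2))  # 0.85
print(round(adjusted_r2(0.86, 40, 4), 2))  # 0.84
```

This is why adjusted R2, rather than R2, is used to compare models with different numbers of explanatory variables.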
12. Test for the significance of the overall model (all explanatory variables):
H0: β1 = β2 = β3 = 0
H1: At least one of the βi's is not zero.
α = 0.05
From the Excel output, we see that F = 74.9, and the p-value is approximately zero. There is strong evidence that the overall model is significant.

Tests for the significance of the individual explanatory variables:

Age:
H0: β1 = 0
H1: β1 ≠ 0
(from Excel output). The p-value is 0.5, so we fail to reject H0. There is not enough evidence to conclude that age is a significant explanatory variable for salaries, when years of postsecondary education and years of experience are included in the model. This is not surprising, given how closely related age and years of experience appear to be.
Years of postsecondary education:
H0: β2 = 0
H1: β2 ≠ 0

(from Excel output). The p-value is approximately zero, so we reject H0. There is enough evidence to conclude that years of postsecondary education is a significant explanatory variable for salaries, when age and years of experience are included in the model.

Years of experience:
H0: β3 = 0
H1: β3 ≠ 0

(from Excel output). The p-value is approximately zero, so we reject H0. There is enough evidence to conclude that years of experience is a significant explanatory variable for salaries, when age and years of postsecondary education are included in the model.

13. Test for the significance of the overall model (age and years of postsecondary education):
H0: β1 = β2 = 0
H1: At least one of the βi's is not zero.
α = 0.05
From the Excel output, we see that F = 81.3, and the p-value is approximately zero. There is strong evidence that the overall model is significant.
Tests for the significance of the individual explanatory variables:

Age:
H0: β1 = 0
H1: β1 ≠ 0

(from Excel output). The p-value is approximately zero, so we reject H0. There is enough evidence to conclude that age is a significant explanatory variable for salaries, when years of postsecondary education are included in the model.

Years of postsecondary education:
H0: β2 = 0
H1: β2 ≠ 0

(from Excel output). The p-value is approximately zero, so we reject H0. There is enough evidence to conclude that years of postsecondary education is a significant explanatory variable for salaries, when age is included in the model.
14. Test for the significance of the overall model (years of experience and years of postsecondary education):
H0: β1 = β2 = 0
H1: At least one of the βi's is not zero.
α = 0.05
From the Excel output, we see that F = 113.4, and the p-value is approximately zero. There is strong evidence that the overall model is significant.

Tests for the significance of the individual explanatory variables:

Years of postsecondary education:
H0: β1 = 0
H1: β1 ≠ 0

(from Excel output). The p-value is approximately zero, so we reject H0. There is enough evidence to conclude that years of postsecondary education is a significant explanatory variable for salaries, when years of experience are included in the model.

Years of experience:
H0: β2 = 0
H1: β2 ≠ 0

(from Excel output). The p-value is approximately zero, so we reject H0. There is enough evidence to conclude that years of experience is a significant explanatory variable for salaries, when years of postsecondary education are included in the model.
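The overall F statistics reported by Excel in these tests can be reproduced from R2 alone. A sketch, using assumed illustrative values rather than the exact Excel figures:

```python
def overall_f(r2, n, k):
    """F statistic for H0: beta_1 = ... = beta_k = 0, computed from
    the coefficient of determination R^2, the sample size n, and the
    number of explanatory variables k. Under H0 it follows an F
    distribution with (k, n - k - 1) degrees of freedom."""
    return (r2 / k) / ((1 - r2) / (n - k - 1))

# Assumed values for illustration: R^2 = 0.86, n = 40, k = 3
f_stat = overall_f(0.86, 40, 3)   # about 73.7, with df = (3, 36)
```

A large F (equivalently, a tiny p-value from the F distribution) is what drives the "overall model is significant" conclusions above.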
15. The adjusted R2 values are shown below:

Model                                                      Adjusted R2
All Explanatory Variables                                  0.85
Years of Postsecondary Education and Age                   0.80
Years of Postsecondary Education and Years of Experience   0.85

At this point, the model that contains years of postsecondary education and age does not seem worth considering. The adjusted R2 value is lower than for the other models. As we have already seen, age and years of experience are highly correlated, and it appears that the model containing years of experience does a better job.

Develop Your Skills 14.4

16. The Excel output from the Multiple Regression Tools add-in is shown below.

Confidence Interval and Prediction Intervals - Calculations
Confidence Level: 95%
Point: Mortgage Rates = 6, Housing Starts = 3500, Advertising Expenditure = 3500

Prediction Interval: Lower limit 106933.6829, Upper limit 137638.2386
Confidence Interval: Lower limit 114521.388, Upper limit 130050.5335
With 95% confidence, the interval ($114,521.39, $130,050.53) contains average Woodbon sales when mortgage rates are 6%, housing starts are 3,500 and advertising expenditure is $3,500.

17. It would not be appropriate to use the Woodbon model to make a prediction for mortgage rates of 6%, housing starts of 2,500, and advertising expenditure of $4,000, because the highest advertising expenditure in the sample data is only $3,500. We should not rely on a model for predictions based on explanatory variable values that are outside the range of the sample data on which the model is based.
18. The Excel output from the Multiple Regression Tools add-in is shown below.

Confidence Interval and Prediction Intervals - Calculations
Confidence Level: 95%
Point: Age = 35, Years of Postsecondary Education = 5

Prediction Interval: Lower limit 31.68852578, Upper limit 64.1197083
With 95% confidence, the interval ($31,689, $64,120) contains the salary of an individual who is 35 years old, and who has 5 years of postsecondary education.

19. The Excel output from the Multiple Regression Tools add-in is shown below.

Confidence Interval and Prediction Intervals - Calculations
Confidence Level: 95%
Point: Age = 35, Years of Postsecondary Education = 5

Confidence Interval: Lower limit 44.47730973, Upper limit 51.330924
With 95% confidence, the interval ($44,477, $51,331) contains the average salary of all individuals who are 35 years old, and who have 5 years of postsecondary education. The confidence interval is narrower than the prediction interval from Exercise 18, because the variability in the average salary is less than the variability for an individual salary.
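The reason the prediction interval is wider than the confidence interval can be sketched with the simple-regression standard error formulas: the prediction interval's standard error carries an extra term for the variability of an individual response. The data here are illustrative, assumed for the sketch:

```python
import numpy as np

# Illustrative data (assumed, not the textbook salary data)
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0])
y = np.array([2.1, 2.9, 4.2, 4.8, 6.1, 6.9, 8.2, 8.8])
n = len(x)

# Least-squares fit and standard error of estimate
b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b0 = y.mean() - b1 * x.mean()
resid = y - (b0 + b1 * x)
s = np.sqrt(np.sum(resid ** 2) / (n - 2))

x0 = 4.5   # point at which the intervals are constructed
h = 1 / n + (x0 - x.mean()) ** 2 / np.sum((x - x.mean()) ** 2)
se_mean = s * np.sqrt(h)        # standard error for the mean response (CI)
se_pred = s * np.sqrt(1 + h)    # standard error for an individual response (PI)
```

Because se_pred includes the extra "1" under the square root, the prediction interval is always wider than the confidence interval at the same point, which matches the comparison in Exercises 18 and 19.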
20. The Excel output from the Multiple Regression Tools add-in is shown below.

Confidence Interval and Prediction Intervals - Calculations
Confidence Level: 95%
Point: Years of Postsecondary Education = 5, Years of Experience = 10

Prediction Interval: Lower limit 32.50852474, Upper limit 60.7561058
With 95% confidence, the interval ($32,509, $60,756) contains the salary of an individual who has 5 years of postsecondary education, and 10 years of experience.

21. The text contains scatter diagrams of Woodbon Annual Sales plotted against mortgage rates and advertising expenditure (see Exhibit 14.2). Each relationship appears linear, with no pronounced curvature. A plot of the residuals versus the predicted y-values for this model is shown below.
[Figure: Woodbon model, residuals versus predicted sales (mortgage rates and advertising expenditure as explanatory variables)]
The plot shows the desired horizontal band appearance, although there appears to be reduced variability for higher predicted values. The other residual plots are shown below.
[Figure: Mortgage Rates residual plot]

The mortgage rates residual plot shows the desired horizontal band appearance.
[Figure: Advertising Expenditure residual plot]
The advertising expenditure residual plot shows decreased variability for higher advertising expenditures. This is a concern, because it appears to violate the required conditions. At this point, we will refrain from conducting the F-test, as the required conditions are not met.
408
Instructor's Solutions Manual – Chapter 14
22. The Excel Regression output for the model that includes per-capita income and population is shown below.

SUMMARY OUTPUT

Regression Statistics
Multiple R          0.720727015
R Square            0.51944743
Adjusted R Square   0.483850943
Standard Error      687.5975486
Observations        30

ANOVA
             df   SS            MS         F          Significance F
Regression    2   13798538.88   6899269    14.59266   5.05245E-05
Residual     27   12765340.5    472790.4
Total        29   26563879.38

                    Coefficients   Standard Error   t Stat     P-value    Lower 95%
Intercept           25502.98998    1443.513376      17.6673    2.3E-16    22541.14522
Population          0.062546304    0.011890908      5.260011   1.52E-05   0.038148175
Per-Capita Income   0.024579975    0.034847903      0.70535    0.486634   -0.046922016
From this we can see that the model is significant.
H0: β1 = β2 = 0
H1: At least one of the βi's is not zero.
α = 0.05
From the Excel output, we see that F = 14.6, and the p-value is approximately zero. There is strong evidence that the overall model is significant. However, only one of the explanatory variables is significant in this model.
Population:
H0: β1 = 0
H1: β1 ≠ 0
(from Excel output). The p-value is approximately zero, so we reject H0. There is enough evidence to conclude that population is a significant explanatory variable for sales, when per-capita income is included in the model.

Per-capita income:
H0: β2 = 0
H1: β2 ≠ 0

(from Excel output). The p-value is 0.49, so we fail to reject H0. There is not enough evidence to conclude that per-capita income is a significant explanatory variable for sales, when population is included in the model.
We proceed with the analysis by creating all possible regressions. The output is shown below.

Multiple Regression Tools - All Possible Models - Calculations

Model 1: Adjusted R^2 = 0.4931, Standard Error = 681.43, K = 1, Significance F = 9.17832E-06
  Intercept 26434.2424 (p-value 7.3224E-28); Population 0.063380025 (p-value 9.17832E-06)

Model 2: Adjusted R^2 = -0.0078, Standard Error = 960.83, K = 1, Significance F = 0.385584016
  Intercept 27795.7368 (p-value 1.64199E-14); Per-Capita Income 0.042711388 (p-value 0.385584016)

Model 3: Adjusted R^2 = 0.4839, Standard Error = 687.63, K = 2, Significance F = 5.05232E-05
  Intercept 25503.11658 (p-value 2.306E-16); Population 0.062550338 (p-value 1.51515E-05); Per-Capita Income 0.024571325 (p-value 0.486807201)
It is clear, from this output, that the model for sales that we would want to explore first is the one with population as the explanatory variable. This model has the highest adjusted R2, the lowest standard error, and it is significant. Of course, once we have focused on this model, we must ensure that it meets the required conditions.
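An "all possible models" run like this one can be sketched in Python with NumPy: fit every non-empty subset of explanatory variables by least squares and compare adjusted R2 values. The data below are assumed for illustration, not the textbook sales data:

```python
import itertools
import numpy as np

def all_possible_models(X, y, labels):
    """Fit every non-empty subset of the columns of X (plus an
    intercept) by least squares and report adjusted R^2 for each,
    mimicking an 'all possible models' run.
    Returns a list of (variable_labels, adjusted_r2) pairs."""
    n = len(y)
    results = []
    for k in range(1, X.shape[1] + 1):
        for cols in itertools.combinations(range(X.shape[1]), k):
            Xs = np.column_stack([np.ones(n)] + [X[:, c] for c in cols])
            beta, *_ = np.linalg.lstsq(Xs, y, rcond=None)
            resid = y - Xs @ beta
            r2 = 1 - resid @ resid / np.sum((y - y.mean()) ** 2)
            adj = 1 - (1 - r2) * (n - 1) / (n - k - 1)
            results.append(([labels[c] for c in cols], adj))
    return results

# Illustrative data (assumed): y depends mainly on the first column
rng = np.random.default_rng(1)
X = rng.normal(size=(30, 2))
y = 3 * X[:, 0] + rng.normal(scale=0.1, size=30)
models = all_possible_models(X, y, ["population", "income"])
```

With two candidate variables there are three models (two one-variable models and one two-variable model), and the subset containing the genuinely related variable comes out with the best adjusted R2, just as population does in the output above.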
The scatter diagram for sales and population shows some evidence of a positive linear relationship.
[Figure: Scatter diagram of Sales and Population]
A plot of the residuals versus the predicted sales values for this model is shown below.
[Figure: Residuals versus predicted sales (model based on population)]
This plot shows, more or less, the desired horizontal band appearance. However, there are two points that raise questions, as they are far from the other points (the points are circled on the plot).
The plot of residuals versus population is shown below.
[Figure: Population residual plot]
As we might have expected, the same two points stand out in the plot. There are no dates associated with these data points, so we cannot assess whether they are related over time. A histogram of the residuals is shown below.
[Figure: Histogram of residuals (model based on population)]
The histogram appears to be approximately normally distributed.
413
Instructor's Solutions Manual – Chapter 14
There are two observations which produce a standardized residual ≥ +2 or ≤ -2 (observations 2 and 10). These are the same data points that stood out on the residual plots. If we had access to the original data, we would double-check these data points. Because we cannot do that, we will leave them in the model. Our analysis suggests that the model that predicts sales on the basis of population is the best model for this data set. Because the model appears to meet the required conditions, it could be used as the basis for predictions of sales.

23. Here is the correlation matrix for the variables in the Salaries data set.
                                   Age           Years of Postsec. Ed.   Years of Experience   Salary (000)
Age                                1
Years of Postsecondary Education   0.318722486   1
Years of Experience                0.971756454   0.227538151             1
Salary (000)                       0.861913715   0.528597263             0.862062768           1
From this we can see that years of experience and age are very highly correlated, and so we would not choose to include both in our model. Both age and years of experience are very highly correlated with salary, and so one or the other appears to be promising as an explanatory variable.
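A correlation matrix like this one can be computed with NumPy's corrcoef. The columns below are assumed for illustration (built so that age and experience move together, as they do in the Salaries data):

```python
import numpy as np

# Illustrative columns (assumed): experience is age minus a roughly
# constant starting age, so the two are almost perfectly correlated
age        = np.array([25, 30, 35, 40, 45, 50, 55, 60], dtype=float)
experience = age - np.array([22, 23, 22, 24, 23, 22, 24, 23], dtype=float)
education  = np.array([2, 4, 3, 5, 2, 6, 4, 3], dtype=float)

# np.corrcoef treats each row as a variable; the result is the
# symmetric correlation matrix with 1s on the diagonal
corr = np.corrcoef(np.vstack([age, experience, education]))
r_age_exp = corr[0, 1]
```

A pairwise correlation near 1, like the 0.97 between age and years of experience above, is the signal that the two variables carry nearly the same information and should not both be included in the model.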
24. We will use the Excel add-in to provide summary data about all possible models.

Model 1: Adjusted R^2 = 0.7361, Standard Error = 9.0905, K = 1, Significance F = 9.16507E-13
  Intercept 0.207680673 (p-value 0.967203503); Age 1.252388299 (p-value 9.16507E-13)

Model 2: Adjusted R^2 = 0.2605, Standard Error = 15.2187, K = 1, Significance F = 0.000454415
  Intercept 36.37394928 (p-value 5.56639E-10); Years of Postsecondary Education 4.03150375 (p-value 0.000454415)

Model 3: Adjusted R^2 = 0.7364, Standard Error = 9.0860, K = 1, Significance F = 8.99113E-13
  Intercept 28.23279516 (p-value 2.36687E-13); Years of Experience 1.365020143 (p-value 8.99113E-13)

Model 4: Adjusted R^2 = 0.8046, Standard Error = 7.8222, K = 2, Significance F = 2.87227E-14
  Intercept -2.125320671 (p-value 0.628936674); Age 1.121515567 (p-value 1.8482E-12); Years of Postsecondary Education 2.155278577 (p-value 0.000547032)
Model 5: Adjusted R^2 = 0.7404, Standard Error = 9.0175, K = 2, Significance F = 5.53555E-12
  Intercept 13.78392724 (p-value 0.249310597); Age 0.631384586 (p-value 0.216724787); Years of Experience 0.69640475 (p-value 0.211309492)

Model 6: Adjusted R^2 = 0.8521, Standard Error = 6.8052, K = 2, Significance F = 1.66036E-16
  Intercept 20.87563476 (p-value 9.19447E-11); Years of Postsecondary Education 2.673930095 (p-value 2.60097E-06); Years of Experience 1.238703002 (p-value 1.02872E-14)

Model 7: Adjusted R^2 = 0.8504, Standard Error = 6.8448, K = 3, Significance F = 1.51907E-15
  Intercept 27.70373012 (p-value 0.005220235); Age -0.3191034 (p-value 0.453660758); Years of Postsecondary Education 2.846348768 (p-value 5.77326E-06); Years of Experience 1.568477845 (p-value 0.001223366)
25. There are many possible models here. However, the one that looks most promising is the one that includes years of experience and years of postsecondary education. This is a logical model. Overall, it is significant, and each of the explanatory variables is significant. The standard error is relatively low. As well, the model makes sense. It is reasonable to expect that both of these factors would have a positive impact on salary. We cannot decide to rely on this model without checking the required conditions. The residual plots are shown below.
[Figure: Residuals vs. Predicted Salary (Years of Postsecondary Education, Years of Experience)]
[Figure: Years of Postsecondary Education residual plot]
[Figure: Years of Experience residual plot]
All the residual plots show the desired horizontal band appearance, centred on zero. A histogram of the residuals has some right-skewness.
[Figure: Histogram of residuals, salary model (Years of Postsecondary Education, Years of Experience)]
There are no obvious outliers or influential observations. We choose this model as the best available.
26. The Excel Regression output for the model based on income only is shown below.
SUMMARY OUTPUT

Regression Statistics
Multiple R          0.604253286
R Square            0.365122033
Adjusted R Square   0.345883307
Standard Error      422.1512823
Observations        35

ANOVA
             df   SS            MS         F          Significance F
Regression    1   3382189.615   3382190    18.97849   0.000121079
Residual     33   5880986.271   178211.7
Total        34   9263175.886

                Coefficients   Standard Error   t Stat     P-value    Lower 95%
Intercept       75.02173946    396.7061107      0.189112   0.851164   -732.0829074
Income ($000)   23.17297158    5.319255679      4.356431   0.000121   12.35086458
Compare these results with those shown in Exhibit 14.26, where gender is included in the model. We see that the adjusted R2 is higher for the model that includes gender, and the standard error is lower. It appears that adding the gender variable improves the model.
27. We used indicator variables as shown in Exhibit 14.27 in the text. The Excel Regression output is as shown below.
SUMMARY OUTPUT

Regression Statistics
Multiple R          0.404030563
R Square            0.163240695
Adjusted R Square   0.101258525
Standard Error      313.2587914
Observations        30

ANOVA
             df   SS            MS            F             Significance F
Regression    2   516890.0667   258445.0333   2.633671806   0.090179418
Residual     27   2649538.9     98131.07037
Total        29   3166428.967

            Coefficients   Standard Error   t Stat         P-value       Lower 95%
Intercept   1564.4         99.06112778      15.79226923    3.67865E-15   1361.143357
Onever      -218.9         140.0935904      -1.562526875   0.12981017    -506.3483007
Durible     -313.4         140.0935904      -2.237075937   0.03373543    -600.8483007
H0: β1 = β2 = 0
H1: At least one of the βi's is not zero.
α = 0.05
From the Excel output, we see that F = 2.63, and the p-value is about 9%. There is not enough evidence to infer that there is a significant relationship between battery life and brand. Note that this is the same conclusion we came to in Chapter 11.
28. First, we must set up the data set with indicator variables. We cannot run the regression with the 1, 2, and 3 codes for region that are in the data set. Two indicator variables (in combination) indicate region, as follows:

Region      Indicator 1   Indicator 2
Central     1             0
North       0             1
Southwest   0             0
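The coding table above can be sketched as a small helper (the function name is hypothetical, chosen for illustration):

```python
def region_indicators(region):
    """Encode the three-level region variable with two indicator
    (dummy) variables, using Southwest as the base (0, 0) category,
    matching the coding table above. Running the regression on the
    raw 1/2/3 codes would wrongly treat region as quantitative."""
    coding = {
        "Central":   (1, 0),
        "North":     (0, 1),
        "Southwest": (0, 0),
    }
    return coding[region]

rows = [region_indicators(r) for r in ["Central", "North", "Southwest"]]
```

In general, a qualitative variable with m categories needs m - 1 indicator variables, with the omitted category acting as the base level.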
Excel's Regression output is shown below.
SUMMARY OUTPUT

Regression Statistics
Multiple R          0.692380383
R Square            0.479390595
Adjusted R Square   0.429009039
Standard Error      12.12223863
Observations        35

ANOVA
             df   SS            MS         F        Significance F
Regression    3   4194.738105   1398.246   9.5152   0.000131176
Residual     31   4555.408752   146.9487
Total        34   8750.146857

                                     Coefficients   Standard Error   t Stat     P-value    Lower 95%
Intercept                            -14.78132092   11.7183791       -1.26138   0.216582   -38.68111257
Number of Sales Contacts (Monthly)   0.827088142    0.174906441      4.728746   4.67E-05   0.470364104
Region Indicator 1                   14.07075014    5.122616397      2.74679    0.009933   3.623105161
Region Indicator 2                   6.169658282    5.449944445      1.132059   0.266291   -4.945576653
We can see that the overall model is significant. As well, the number of sales contacts is significant. The first indicator variable is also significant, but the second one is not.
What does this mean? It appears that when the first region indicator variable is included in the model, the second one is not significant. If we think about what the region indicator variables tell us, it appears that region is significant, but only in the sense that it matters whether the region is central, or not (the distinction between north and southwest is not significant). In fact, if we re-run the model, keeping only the distinction between sales in the central region or not, the results are as follows:
SUMMARY OUTPUT

Regression Statistics
Multiple R          0.67665967
R Square            0.457868309
Adjusted R Square   0.423985079
Standard Error      12.17545162
Observations        35

ANOVA
             df   SS            MS         F          Significance F
Regression    2   4006.414947   2003.207   13.51312   5.56771E-05
Residual     32   4743.73191    148.2416
Total        34   8750.146857

                                     Coefficients   Standard Error   t Stat     P-value    Lower 95%
Intercept                            -11.10718667   11.30939798      -0.98212   0.333408   -34.1436764
Number of Sales Contacts (Monthly)   0.822594987    0.175628992      4.683708   4.97E-05   0.464850439
Region Indicator 1                   10.67039881    4.16780093       2.560199   0.015384   2.180866167
The model can be interpreted as follows:
Sales = -$11,107.19 + $822.59 × Number of Sales Contacts + $10,670.40 for the central region, and
Sales = -$11,107.19 + $822.59 × Number of Sales Contacts for the north or southwest regions.
29. Excel's Regression output for the model including both number of employees and shift is shown below.
SUMMARY OUTPUT

Regression Statistics
Multiple R          0.40825172
R Square            0.166669467
Adjusted R Square   0.128790806
Standard Error      4165.200265
Observations        47

ANOVA
             df   SS            MS         F          Significance F
Regression    2   152673338.7   76336669   4.400089   0.018112587
Residual     44   763351302.8   17348893
Total        46   916024641.5

                         Coefficients   Standard Error   t Stat     P-value    Lower 95%
Intercept                33118.64628    10531.14671      3.144828   0.002976   11894.51498
Number of Employees      246.1711185    83.00625235      2.965694   0.004865   78.88301139
Shift (0=Day, 1=Night)   158.6692673    1282.379223      0.12373    0.902092   -2425.796202
It appears the overall model is significant; however, shift is not a significant explanatory variable when the number of employees is included in the model. As well, if the model is run with only shift included as an explanatory variable, it is not significant. Therefore, it appears that shift is not a useful explanatory variable for the number of units produced. As well, the model based on number of employees, while significant, is not a particularly useful model (the adjusted R2 is only 0.15).
30. Province is not a significant explanatory variable for wages and salaries, either as the sole explanatory variable (p-value for the F-test = 0.67), or when age is included in the model (p-value for the test of the province indicator variable coefficient = 0.76). While it appears that the model including age alone is significant, the required conditions are not met. See the residual plot shown below. The plot clearly shows a pattern of increasing variability for higher ages.
[Figure: Age residual plot (wages and salaries model)]
Chapter Review Exercises

1. The model can be interpreted as follows: $Monthly Credit Card Balance = $38.36 + $0.99(Age of Head of Household) + $22.04(Income in thousands of dollars) + $0.38(Value of the home in thousands of dollars). Generally, monthly credit card balances are higher for older heads of household with higher incomes and more expensive homes.
2.
H0: β1 = β2 = β3 = 0
H1: At least one of the βi's is not zero.
α = 0.05
From the Excel output, we see that F = 5.96. The F distribution will have (3, 31) degrees of freedom. We estimate that the p-value < 0.01. There is strong evidence that the overall model is significant.
3.
Age of head of household:
H0: β1 = 0
H1: β1 ≠ 0
(from Excel output). The p-value is 0.93, so we fail to reject H0. There is not enough evidence to conclude that age of head of household is a significant explanatory variable for credit card balances, when household income and value of the home are included in the model.

Income ($000):
H0: β2 = 0
H1: β2 ≠ 0
(from Excel output). The p-value is 0.02, so we reject H0. There is enough evidence to conclude that household income is a significant explanatory variable for credit card balances, when age of head of household and value of the home are included in the model.
Value of home ($000):
H0: β3 = 0
H1: β3 ≠ 0

(from Excel output). The p-value is 0.92, so we fail to reject H0. There is not enough evidence to conclude that value of the home is a significant explanatory variable for credit card balances, when age of head of household and household income are included in the model.

4. As we might expect, age of head of household is fairly highly correlated with household income and value of home. When age and income are included in the model, age is not a significant explanatory variable. When age and home value are included in the model, neither is a significant explanatory variable, although the model is significant. When income and home value are included in the model, only income is a significant explanatory variable. When age, income and home value are included in the model, it is significant but none of the explanatory variables is significant. It appears there is multicollinearity among these variables.
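One common way to quantify the multicollinearity described here is the variance inflation factor (VIF). This is not part of the textbook's Excel output; the sketch below uses assumed illustrative data:

```python
import numpy as np

def vif(X, j):
    """Variance inflation factor for column j of X: regress column j
    on the remaining columns (with an intercept) and return
    1 / (1 - R_j^2). Values well above about 5 suggest column j is
    largely explained by the other explanatory variables."""
    n = X.shape[0]
    y = X[:, j]
    others = np.delete(X, j, axis=1)
    Xd = np.column_stack([np.ones(n), others])
    beta, *_ = np.linalg.lstsq(Xd, y, rcond=None)
    resid = y - Xd @ beta
    r2 = 1 - resid @ resid / np.sum((y - y.mean()) ** 2)
    return 1 / (1 - r2)

# Illustrative data (assumed): the second column nearly duplicates the first
rng = np.random.default_rng(7)
x1 = rng.normal(size=40)
x2 = x1 + rng.normal(scale=0.05, size=40)   # almost collinear with x1
x3 = rng.normal(size=40)                    # unrelated to the others
X = np.column_stack([x1, x2, x3])
```

A symptom like the one in Exercise 4 (the overall model significant while no individual coefficient is) goes hand in hand with inflated VIFs for the correlated columns.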
5.
Because the tests are done in the same format as the final exam, it is expected that the tests will prove to be better predictors of the final exam mark. However, good knowledge of the material is likely to result in higher marks for all of the evaluations, so we must consider that any one of them could be a good predictor of the final exam mark.
6.
None of the correlations between the explanatory variables is particularly high. There is a fairly high correlation between the mark on Test #2 and the final exam mark, which suggests that the mark on Test #2 might be a good explanatory variable for the final exam mark.
7.
The model that predicts the final exam mark on the basis of the mark on Test #2 is clearly the best. The adjusted R2 is higher and the standard error lower than for all the other variations. As well, the model is significant (the p-value for the F-test is approximately zero).
8.
We have 95% confidence that the interval ($675.45, $2,486.74) contains the monthly credit card bill for a head of household aged 45, with annual income of $65,000 and a home valued at $175,000. The Excel output is shown below.

Confidence Interval and Prediction Intervals - Calculations
Confidence Level: 95%
Point: Age of Head of Household = 45, Income (000) = 65, Value of Home (000) = 175

Prediction Interval: Lower limit 675.4457684, Upper limit 2486.743786
9.
All of the models that include Test #2 as an explanatory variable are better than those which do not. Test #2 was the basis for the best model when only one explanatory variable was included in the model, and so this is not surprising. The two-variable model with the highest adjusted R2 contains Test #2 and Assignment #2. Adding Assignment #2 to Test #2 as an explanatory variable increases the adjusted R2 value from 0.51 to 0.57, and the standard error decreases from 14.3 to 13.4. Prediction and confidence intervals made with the two-variable model would be narrower than for the model with only Test #2. The two-variable model is better, but whether it is "best" depends on the way the model might be used. Suppose it is being used to predict the exam marks, and identify those who are in danger of failing the course, or not achieving a grade level necessary for external accreditation. Test #2 is a significant explanatory variable. If Assignment #2 comes much later in the course, it may be better to use the single-variable model, so that the student can be alerted to a potential problem earlier, with time for adjustments.
10. The residual plots for this model are shown below. All have the desired appearance.
(Figure: Assignment #2 Residual Plot; residuals vs. Assignment #2)

(Figure: Test #2 Residual Plot; residuals vs. Test #2)

(Figure: Residuals vs. Predicted Exam Mark (Test #2 and Assignment #2))
There is one data point that produces a standardized residual greater than 2 (observation 69). However, there is no way to double-check this point. It appears this model meets the required conditions.

11. None of the three-variable models represents a real improvement on the model that includes Test #2 and Assignment #2. As we might expect, the best three-variable models include both Test #2 and Assignment #2. The best of these, in terms of adjusted R2, also contains Test #1. However, Test #1 is not significant as an explanatory variable when Test #2 and Assignment #2 are included in the model. This is true for all the other models that include both Test #2 and Assignment #2: the third explanatory variable is not significant when Test #2 and Assignment #2 are included in the model.

12. The Excel Regression output for the model containing all possible explanatory variables is shown below.

SUMMARY OUTPUT

Regression Statistics
Multiple R           0.775514198
R Square             0.601422271
Adjusted R Square    0.579030264
Standard Error       13.22677489
Observations         95

ANOVA
              df    SS             MS          F           Significance F
Regression     5    23494.40275    4698.881    26.85879    1.85347E-16
Residual      89    15570.33409    174.9476
Total         94    39064.73684

                 Coefficients    Standard Error    t Stat      P-value     Lower 95%
Intercept        18.08370688     5.405029406       3.345719    0.001203    7.344028811
Assignment #1    0.100148977     0.078524761       1.275381    0.205494    -0.055878047
Test #1          0.149314405     0.085242328       1.751646    0.083279    -0.020060282
Assignment #2    0.134964923     0.056516334       2.388069    0.019051    0.022668174
Test #2          0.442801866     0.076755029       5.769027    1.14E-07    0.290291262
Quizzes          0.023581516     0.064451647       0.365879    0.715323    -0.104482531
Again we see that the explanatory variables other than Test #2 and Assignment #2 are not significant in this model. This is not the best model.
13. Using the model that includes Test #2 and Assignment #2, the Excel output is shown below (split for visibility):

Confidence Interval and Prediction Intervals - Calculations
Confidence Level (%): 95

Point Number    Assignment #2    Test #2
1               65               70

Prediction Interval
Lower limit    Upper limit
48.90969082    102.4567362
We have 95% confidence that the interval (48.9, 102.5) contains the final exam mark of a student who received a mark of 65 on Assignment 2 and 70 on Test 2 (since marks cannot exceed 100, the practical upper limit is 100). After all the analysis, it appears that the best model generates a prediction interval so wide that it is not really useful.
14. The Excel output for all possible regressions calculations is shown below.

Multiple Regression Tools - All Possible Models - Calculations
Model Number: 1
Adjusted R^2: 0.325238882    Standard Error: 2007.384641    K: 1    Significance F: 0.001726878
Variable Labels    Coefficients     p-value
Intercept          -3294907.127     0.00180397
Year               1651.875001      0.001726878

Model Number: 2
Adjusted R^2: 0.420323605    Standard Error: 1860.580138    K: 1    Significance F: 0.00027344
Variable Labels    Coefficients     p-value
Intercept          21217.96234      1.44594E-15
Kilometres         -0.056903836     0.00027344

Model Number: 3
Adjusted R^2: 0.519585698    Standard Error: 1693.805491    K: 2    Significance F: 0.000120809
Variable Labels    Coefficients     p-value
Intercept          -2076840.42      0.026740574
Year               1045.988481      0.025384546
Kilometres         -0.042999778     0.00403536
All three models are significant. The model with the highest adjusted R2 is the one that includes both year and kilometres as explanatory variables. Both year and kilometres are significant explanatory variables, when the other variable is included in the model.
However, this model presents some problems. Initial scatter diagrams for each of the explanatory variables are shown below.

(Figure: Honda Accord Prices on AutoTrader.ca (November 2008); list price of used cars, 2004-2008 model years, vs. model year)
In this scatter diagram, we see there is more variability for list prices for older cars. Note there is only one data point for a car from the 2007 model year.
(Figure: Honda Accord Prices on AutoTrader.ca (November 2008); list price of used cars, 2004-2008 model years, vs. kilometres)
Here, we see there is more variability in list prices for Honda Accords with higher kilometres.
With both these explanatory variables included in the model, the residual plots are as shown below.
(Figure: Year Residual Plot; residuals vs. model year)
(Figure: Kilometres Residual Plot; residuals vs. kilometres)
Both of these plots show an unusual observation, which is circled. Both refer to the observation where a Honda Accord with 121,353 kilometres, 2005 model year, was listed for $19,888. There may be something unusual about this observation to explain why the list price is so unusually high for a relatively high-mileage (kilometrage!) car. Referring back to the original listing, if it were available, might tell us something to explain this. Since we do not have this information available, we cannot assess if this point is legitimate.
(Figure: Residuals vs. Predicted List Price (Year and Kilometres))
The outlier that showed up on the other residual plots also shows up here. There are some concerns about the model. The standard error is fairly large, so, for example, if we predicted the list price of a 2005 Honda Accord with 85,000 kilometres, the prediction interval would be ($13,114.99, $20,308.02). Therefore, the model is not that useful for predicting the list price of a used Honda Accord.

15. The year of the car is not really a quantitative variable. There are four years (2004, 2005, 2006, and 2007) in the sample data set, so three indicator variables are required. They could be set up as follows:

Year    Indicator Variable 1    Indicator Variable 2    Indicator Variable 3
2004    1                       0                       0
2005    0                       1                       0
2006    0                       0                       1
2007    0                       0                       0

All possible regressions calculations provide many possible models. However, notice again that there is only one observation for the year 2007. The data set is not really large enough to support this analysis. We will proceed, out of curiosity.
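The indicator-variable coding in the table above can be expressed as a small function. This is a sketch of one possible encoding (the function name is ours), with 2007 acting as the baseline year that receives all zeros:

```python
# Build the three indicator variables for the four model years in the
# sample, matching the table above: 2007 is the baseline (all zeros).
def year_indicators(year: int) -> tuple[int, int, int]:
    if year not in (2004, 2005, 2006, 2007):
        raise ValueError("year outside the sample's range")
    return (int(year == 2004), int(year == 2005), int(year == 2006))

print(year_indicators(2004))  # (1, 0, 0)
print(year_indicators(2007))  # (0, 0, 0)
```

The general rule the table illustrates: a qualitative variable with m categories needs m - 1 indicator variables, because the remaining category is identified by all indicators being zero.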
The model with the highest adjusted R2 contains kilometres and only the indicator variable specifying whether or not the car is from the 2004 model year.

Model Number: 5
Adjusted R^2: 0.590043715    Standard Error: 1564.675734    K: 2    Significance F: 2.11076E-05
Variable Labels                 Coefficients     p-value
Intercept                       21641.08247      8.12791E-17
Kilometres                      -0.049699268     0.000244684
1=Year 2004, 0=Not Year 2004    -2071.672715     0.003726813
The model is as follows:

For the model year 2004:
List price = $21,641.08 – 0.05(Kilometres) – $2,071.67

For model years 2005, 2006, and 2007:
List price = $21,641.08 – 0.05(Kilometres)

Notice that this model is more intuitive than the model from Exercise 14, which was:

List price = –$2,076,840.42 + $1,045.99(Year) + 0.043(Kilometres)

Such a model does not really make sense, and this should have been your clue that treating the year of a car as a quantitative variable is not the correct approach.
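The two cases of the fitted model can be combined in a single function. The sketch below uses the full-precision coefficients from the Exercise 15 output rather than the rounded ones quoted in the text; the function name is illustrative:

```python
# Coefficients from the Exercise 15 Excel output (full precision).
INTERCEPT = 21641.08247
KM_COEF = -0.049699268
YEAR_2004_COEF = -2071.672715

def predicted_list_price(kilometres: float, model_year: int) -> float:
    """Predicted list price for a used Honda Accord, using the
    kilometres + 2004-indicator model described above."""
    price = INTERCEPT + KM_COEF * kilometres
    if model_year == 2004:
        price += YEAR_2004_COEF   # the indicator shifts the intercept down
    return price

# A 2004 car vs. a 2006 car, both with 120,000 km:
print(round(predicted_list_price(120_000, 2004), 2))  # 13605.5
print(round(predicted_list_price(120_000, 2006), 2))  # 15677.17
```

Note how the indicator variable changes only the intercept, not the slope on kilometres, which is exactly what the two written cases of the model say.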
16. All seven possible regression models are significant, as the output for all possible regressions calculations shows.

Multiple Regression Tools - All Possible Models - Calculations
Model Number: 1
Adjusted R^2: 0.279439295    Standard Error: 1587.969587    K: 1    Significance F: 0.001578561
Variable Labels                      Coefficients    p-value
Intercept                            26728.22947     4.10163E-28
Local Population                     0.014837063     0.001578561

Model Number: 2
Adjusted R^2: 0.422781893    Standard Error: 1421.270907    K: 1    Significance F: 6.02729E-05
Variable Labels                      Coefficients    p-value
Intercept                            17430.60759     4.51747E-08
Median Income in Local Area          0.163329197     6.02729E-05

Model Number: 3
Adjusted R^2: 0.243535302    Standard Error: 1627.051224    K: 1    Significance F: 0.003278203
Variable Labels                      Coefficients    p-value
Intercept                            23903.47264     5.63582E-16
Estimated Traffic Volume (Weekly)    0.188766345     0.003278203

Model Number: 4
Adjusted R^2: 0.461214393    Standard Error: 1373.140214    K: 2    Significance F: 9.0189E-05
Variable Labels                      Coefficients    p-value
Intercept                            18991.09596     2.35328E-08
Local Population                     0.007473002     0.094817808
Median Income in Local Area          0.127329166     0.003228531
Model Number: 5
Adjusted R^2: 0.51721688    Standard Error: 1299.819151    K: 2    Significance F: 2.0497E-05
Variable Labels                      Coefficients    p-value
Intercept                            22441.81416     6.68122E-17
Local Population                     0.014268244     0.000332706
Estimated Traffic Volume (Weekly)    0.180554925     0.000664511

Model Number: 6
Adjusted R^2: 0.485521563    Standard Error: 1341.808324    K: 2    Significance F: 4.83609E-05
Variable Labels                      Coefficients    p-value
Intercept                            16748.85286     4.97853E-08
Median Income in Local Area          0.133898131     0.000822866
Estimated Traffic Volume (Weekly)    0.110679438     0.045105932

Model Number: 7
Adjusted R^2: 0.567509125    Standard Error: 1230.255648    K: 3    Significance F: 1.49902E-05
Variable Labels                      Coefficients    p-value
Intercept                            18633.31979     5.65578E-09
Local Population                     0.009787852     0.020230685
Median Income in Local Area          0.079865392     0.052205498
Estimated Traffic Volume (Weekly)    0.13655739      0.010370099
None of the one-variable models seems useful, as the adjusted R2 is quite low. Of the two-variable models, the most promising is Model Number 5, with local population and estimated weekly traffic volume as explanatory variables. The model is significant, and each of the explanatory variables is significant when the other one is included in the model. The adjusted R2 is 0.52, which is not high, but still better than for the other two-variable models.

At first, it appears that Model Number 7, which includes all three explanatory variables, might be best, as it has the highest adjusted R2 of all the models. However, note that median income in the local area is not a significant explanatory variable (at a 5% significance level) when the other two variables are included in the model. As well, note that the standard error for this model is almost the same as for Model Number 5, which relies on only two explanatory variables.
Therefore, we will investigate Model Number 5 to see if it conforms to the required conditions. First, such a model makes some sense. Initial scatter diagrams for monthly sales and each explanatory variable show some evidence of a positive linear relationship, although neither relationship looks particularly strong. Note that the scales on some of the axes in the graphs below do not start at zero.
(Figure: Monthly Sales and Local Population; scatter diagram)

(Figure: Monthly Sales and Weekly Traffic; scatter diagram of monthly sales vs. estimated weekly traffic volume)
Residual plots show (more or less) the desired horizontal band appearance, as shown below.
(Figure: Local Population Residual Plot)

(Figure: Estimated Traffic Volume (Weekly) Residual Plot)
(Figure: Residuals vs. Predicted Monthly Sales, Doughnut Shop Model (Population and Weekly Traffic as Explanatory Variables))
A histogram of the residuals is approximately normal. There do not appear to be any outliers or influential observations.
(Figure: Histogram of residuals, Monthly Sales Model, Doughnut Shop (Population and Weekly Traffic as Explanatory Variables))
It appears the model meets the required conditions. This model could be the basis of a location decision. The form of the model is as follows:

Predicted Monthly Sales = $22,441 + 0.0143(Local Population) + 0.1806(Estimated Weekly Traffic Volume)
17. While it is tempting to add the new data and re-analyze the model which was best from the analysis we did for Exercise 16, the correct approach is to look at all possible models. We have to allow for the possibility that the new information ALONE will be the basis of the most important explanatory variable.

In fact, the output of all possible regressions calculations shows that inclusion of the indicator variable for the location being within a five-minute drive of a major highway does improve the model we chose as best for Exercise 16. However, the best of all of the models, in terms of adjusted R2, is the model with all possible explanatory variables. The adjusted R2 for this model is 0.656, compared with 0.517 for the preferred model in Exercise 16.

The data requirements for this model are more onerous, and this would have to be taken into consideration before the model was selected. While local population and median incomes could be obtained through Statistics Canada, information about estimated weekly traffic volume will probably have to be collected (possibly over several weeks). However, the information about whether a location is within a five-minute drive of a major highway could be obtained by looking at road maps and estimating driving distance.

We will analyze the "all-in" model to see if it conforms to the required conditions. Residual plots look acceptable. The histogram of residuals appears normally distributed. There are no obvious outliers or influential observations.
(Figure: Local Population Residual Plot)
(Figure: 1=Within Five-Minute Drive of Major Highway, 0=Otherwise Residual Plot)

(Figure: Median Income in Local Area Residual Plot)

(Figure: Estimated Traffic Volume (Weekly) Residual Plot)
(Figure: Residuals vs. Predicted Monthly Sales, Doughnut Shop Sales Prediction Model (All Explanatory Variables))
(Figure: Histogram of residuals, Doughnut Shop Sales Prediction Model (All Explanatory Variables))
It appears the model meets the required conditions. The model is as follows:

For locations within a five-minute drive of a major highway:
Predicted Monthly Sales = $17,413 + 0.009(Local Population) + 0.089(Median Income in Local Area) + 0.145(Estimated Weekly Traffic Volume) + $1,137

For locations not within a five-minute drive of a major highway:
Predicted Monthly Sales = $17,413 + 0.009(Local Population) + 0.089(Median Income in Local Area) + 0.145(Estimated Weekly Traffic Volume)

18. a. Since the Canadian economy is resource-based, it does not seem unusual to look to resource stocks as a stand-in for the entire stock index. One could argue that as the economy overall goes, so will go the financial sector and the stock index. The Rona stock seems less likely, ahead of time, to be a good predictor of the TSX. However, we will begin, as usual, by looking at all possible models.
However, when we do this, we do not find a useful model. Of the one-variable models, the best is the one that predicts the TSX on the basis of the price of Potash Corporation stock. However, although the model is significant, the adjusted R2 is quite low, and the standard error is relatively high. When we examine the two-variable models, all of them have at least one variable that is not significant in the model when the other is present. All the three- and four-variable models show the same problem. This is not surprising, as all of the stocks and the TSX will be affected by overall economic conditions. Some multicollinearity is the likely result. Scatter diagrams for the TSX and each stock price also hint that there does not appear to be a linear relationship between these stock prices and the TSX.
(Figure: TSX and Royal Bank Stock Price; scatter diagram)

(Figure: TSX and Rona Inc. Stock Price; scatter diagram)
(Figure: TSX and Petro Canada Stock Price; scatter diagram)

(Figure: TSX and Potash Corp. Stock Price; scatter diagram)
The only relationship that looks somewhat linear is the last one, between the TSX and the Potash Corporation stock price, and there is clearly more variability in the TSX for prices in the lower part of the Potash Corporation stock price range for these data. This helps explain why the only model that appeared to have any predictive power was the one based on the stock price of the Potash Corporation. b. There is definitely evidence of the stock market crisis at the end of 2008. For example, if we examine the residuals for the model based on the Potash Corporation stock price, they show a definite time-related pattern, as shown below. It is not a good idea to try to build a model using data for this time period. Whatever relationships may have held before the fall of 2008, it appeared that financial markets were increasingly unpredictable, and the new information that became available at the time of the crisis may change forever the way stock markets work.
(Figure: TSX Model, Residuals Over Time (Potash Corporation Stock Price as Explanatory Variable); residuals plotted monthly, 11/2002 to 11/2008)
19. There are many possible models. However, for many of the models, when the overall model is significant, some of the individual explanatory variables are not significant, given the other explanatory variables in the model. This is not surprising, as the factors that lead to student success in one subject probably contribute to student success in other subjects. The best one-variable model is based on the mark in Intermediate Accounting 1. The best two-variable model includes the marks in Intermediate Accounting 1 and Cost Accounting 1. Model results are summarized below.

Model Number: 1
Adjusted R^2: 0.520480342    Standard Error: 12.93893991    K: 1    Significance F: 2.05994E-09
Variable Labels              Coefficients    p-value
Intercept                    17.2097272      0.008500616
Intermediate Accounting 1    0.711183518     2.05994E-09

Model Number: 6
Adjusted R^2: 0.590366643    Standard Error: 11.95895263    K: 2    Significance F: 2.92403E-10
Variable Labels              Coefficients    p-value
Intercept                    14.52779938     0.016864759
Intermediate Accounting 1    0.420200699     0.002427925
Cost Accounting 1            0.377202568     0.003952342
Of these two, Model Number 6 appears to be the better model, with a higher adjusted R2, and somewhat lower standard error.
20. Scatter diagrams of the Statistics 1 mark and each of the marks in Intermediate Accounting 1 and Cost Accounting 1 are shown below. Both relationships appear linear.
(Figure: Student Marks in Intermediate Accounting 1 and Statistics 1; scatter diagram)

(Figure: Student Marks in Cost Accounting 1 and Statistics 1; scatter diagram)
Residual plots appear to be as desired.
(Figure: Intermediate Accounting 1 Residual Plot)

(Figure: Cost Accounting 1 Residual Plot)
(Figure: Residuals vs. Predicted Statistics 1 Mark (Cost Accounting 1 and Intermediate Accounting 1 as Explanatory Variables))
The histogram of residuals is skewed to the left, as shown below. However, generally, the residuals appear to be normally-distributed.
(Figure: Histogram of residuals (Cost Accounting 1 and Intermediate Accounting 1 as Explanatory Variables))
There are some data points with standardized residuals either ≤ -2 or ≥ +2. However, we have no way to verify these data points, so for now, we have no choice but to leave them in the model. It appears that the model to predict the Statistics 1 mark on the basis of marks in Cost Accounting 1 and Intermediate Accounting 1 meets the required conditions.

21. We have 95% confidence that the interval (42, 91) contains the Statistics 1 mark of an individual student who achieved a mark of 65 in Cost Accounting 1 and Intermediate Accounting 1.
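The standardized-residual screening used in Exercise 20 (flagging residuals ≤ -2 or ≥ +2) can be sketched in a few lines. The function name and the residual values below are made up for illustration:

```python
import numpy as np

def flag_outliers(residuals: np.ndarray, threshold: float = 2.0) -> np.ndarray:
    """Indices of observations whose standardized residual has absolute
    value >= threshold (the common |z| >= 2 screening rule)."""
    z = (residuals - residuals.mean()) / residuals.std(ddof=1)
    return np.flatnonzero(np.abs(z) >= threshold)

# Hypothetical residuals: mostly small, with one large value.
res = np.array([1.2, -0.8, 0.5, -1.1, 9.0, 0.3, -0.6, 0.9, -0.4, 0.1])
print(flag_outliers(res))  # [4]
```

As the solutions note, a flagged point is only a candidate for investigation; without a way to verify the underlying data, it stays in the model.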
22. Of all the one-variable models, the best is the one based on years of experience. Of all the other models, the one based on years of experience and the local advertising budget is best. This model has an adjusted R2 of 0.95. The model seems sensible:

Sales = $12,260 + $1,185(Years of Experience) + $4(Local Advertising Budget)

It seems reasonable to expect that salespeople would increase their skill as they gain years of experience, and this could result in increased sales. It also seems likely that increases in the local advertising budget would lead to increases in sales. Scatter diagrams for each explanatory variable and sales are shown below. Note that the vertical axis on each graph does not start at zero.
(Figure: Sales and Years of Experience; scatter diagram)

(Figure: Sales and Local Advertising Budget; scatter diagram)
There appears to be a strong linear relationship between sales and years of experience. The relationship between the local advertising budget and sales is less obvious. However, these variables in combination appear to provide the best model for sales.

The residual plots appear to have the desired horizontal band appearance. There is one point that appears unusual in all three plots (it is indicated with a triangular marker). In each plot, this point corresponds to the same observation, the 40th data point. If we had the ability to double-check the accuracy of this point, we would. This data point is the only one with a standardized residual either ≤ -2 or ≥ +2.
(Figure: Local Advertising Budget Residual Plot)

(Figure: Years of Experience Residual Plot)
(Figure: Residuals for Marchapex Sales Model (Years of Experience and Local Advertising Budget as Explanatory Variables); residuals vs. predicted sales)
The histogram of the residuals is somewhat bimodal, with some left-skewness. While not significantly non-normal, such a histogram suggests some caution when using the model.
(Figure: Histogram of residuals for Marchapex Sales Model (Years of Experience and Local Advertising Budget as Explanatory Variables))
A 95% prediction interval for a salesperson with 15 years of experience, and a local advertising budget of $4,000 would be ($42,081, $50,035).
Instructor's Manual

Lawrence Tenenbaum
Melanie Christian

For

Analyzing Data and Making Decisions: Statistics for Business, Second Edition

Judith Skuce
Contents
Chapter 1: Using Data to Make Better Decisions ................................ 1
Chapter 2: Using Graphs and Tables to Describe Data ........................... 8
Chapter 3: Using Numbers to Describe Data .................................... 20
Chapter 4: Calculating Probabilities ......................................... 30
Chapter 5: Probability Distributions ......................................... 37
Chapter 6: Using Sampling Distributions to Make Decisions .................... 45
Chapter 7: Making Decisions with a Single Sample ............................. 58
Chapter 8: Estimating Population Values ...................................... 68
Chapter 9: Making Decisions with Matched-Pairs Samples, Quantitative or Ranked Data ... 73
Chapter 10: Making Decisions with Two Independent Samples, Quantitative or Ranked Data ... 79
Chapter 11: Making Decisions with Three or More Samples, Quantitative Data - Analysis of Variance (ANOVA) ... 83
Chapter 12: Making Decisions with Two or More Samples, Qualitative Data ...... 91
Chapter 13: Analyzing Linear Relationships, Two Quantitative Variables ....... 96
Chapter 14: Analyzing Linear Relationships, Two or More Variables ........... 101
Appendix: Solutions to Odd-Numbered Exercises ............................... 107
Chapter 1: Using Data to Make Better Decisions

Learning Objectives:
1. Understand the approaches to gathering data.
2. Understand why sampling is necessary.
3. Recognize that there is art and science to summarizing and analyzing data.
4. Recognize that cause-and-effect conclusions must be drawn carefully.
5. Understand that clear and honest communication of results is necessary for them to be useful.
6. Be familiar with a framework for data-based decision making.
Chapter Outline:
1.1 Getting the Data
1.2 Sampling
1.3 Analyzing the Data
1.4 Making Decisions
1.5 Communication
1.6 A Framework for Data-Based Decision Making

Overview:
Chapter 1 provides a general framework for data-based decision making, something that is rarely provided in introductory statistics texts. The goal is to provide a big picture to students, so that from the beginning, they can understand the benefits of working through a course in statistics.
1.1 Getting the Data

Primary Data: data that you collect yourself for a specific purpose.
Secondary Data: data that were previously collected, not for your specific purpose.
After defining the two basic types of data, primary and secondary, the instructor should stimulate discussion about examples of each type. Attention then should be given to the sources of secondary data, such as Statistics Canada and other various governmental agencies, magazines, newspapers, library sources and so forth. Some example references are given in the textbook on page 4.
1.2 Sampling

A. Population Data: the complete collection of all the data of interest.
B. Sample Data: a subset of the population data.

Discuss why sampling is used as a tool to make decisions instead of analyzing all the data in the population. The reasons we sample, and the idea that sampling can lead to reliable conclusions about populations, should be introduced at this time. These ideas (cost, time, difficulty in surveying, etc.) may be totally new to students, and without them, students may be unable to see why statistical inference is important or interesting.

Non-Statistical Sampling: The elements of the population are chosen for the sample by convenience or according to the researcher's judgment. There is no way to estimate the probability that any particular element from the population will be chosen for the sample. Also known as nonprobability sampling.
Statistical Sampling: Elements of the population are chosen for the sample in a random fashion, with a known probability of inclusion. Also, known as probability sampling.
Inferential Statistics: A set of techniques that allow reliable conclusions to be drawn about population data, on the basis of sample data.

Parameter: A summary measure of the population data.
Sample Statistic: A summary measure of the sample data.
Simple Random Sampling: A sampling process that ensures that each element of the population is equally likely to be selected.
Frame: A list of elements in a population.
The only sampling plan discussed in detail in this introductory text is simple random sampling. One should only allude to the fact that other sampling techniques exist (because they are not normally applied in an introductory text). The challenges of obtaining truly random samples should be described, along with a very accessible Excel-based analogue for picking names out of a hat. Ensure that the students realize that taking a true random sample requires thought and effort.
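For instructors who want a programmatic analogue of picking names out of a hat, the same idea can be sketched in Python's standard library. The function name and the hypothetical frame of student IDs below are ours, not from the text; `random.sample` draws without replacement, so every element of the frame is equally likely to be chosen:

```python
import random

def simple_random_sample(frame, n, seed=None):
    """Draw a simple random sample of n elements from the frame
    (the list of all population elements), without replacement."""
    rng = random.Random(seed)   # seeding makes the draw reproducible
    return rng.sample(frame, n)

# Hypothetical frame of 500 students; draw a sample of 25.
frame = [f"student_{i:03d}" for i in range(1, 501)]
chosen = simple_random_sample(frame, 25, seed=42)
print(len(chosen))  # 25
```

The point to emphasize in class is the same as with the Excel analogue: a defensible random sample starts from a complete frame, and the selection mechanism, not the analyst, decides who is included.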
Sampling Error: The difference between the true value of the population parameter and the value of the corresponding sample statistic.
Nonsampling Errors: Other kinds of errors that can arise in the process of sampling of a population.

Coverage Errors: Errors that arise because of inaccuracy or duplication in the survey frame.

Nonresponse Error: Error that arises when the data cannot be collected for some elements of the sample.

Response Errors: Errors that arise because of problems with the survey collection instrument (e.g., the questionnaire), the interviewer (e.g., bias), the respondent (e.g., faulty memory), or the survey process (e.g., not ensuring that the respondent fits into the target group).
Processing Errors: Errors that occur when the data are being prepared for analysis. Estimation Errors: Errors that arise because of incorrect use of techniques, or calculation errors.
Students should be cautioned about the whole range of possible errors that can occur when data are collected. The errors should be discussed with examples, in order for students to fully comprehend the error terminology and its effects. Note that sampling error is expected, and is something that one can estimate and control. Nonsampling errors can invalidate the conclusions drawn from the sample. The importance of describing and summarizing data is introduced, with examples of graphical and numerical summaries.
1.3 Analyzing the Data It is important to impress upon the students that once the data are collected, the information needs to be organized so one can make sense of it!

Descriptive Statistics: a set of techniques to organize and summarize raw data.

Students should understand that the goal is to represent data truthfully. Data should not be distorted or used to misrepresent. One must always be somewhat skeptical of statistical analysis done by others, because of the possibility of misrepresentation and manipulation.
Make sure to dedicate adequate time in class to Example 1.3 on page 13. Histograms are the foundation for assessing the normality of distributions and the appropriateness of particular statistical techniques to come later in the textbook. It should be emphasized that the horizontal scales should be identical when comparing two data sets. Excel has built-in tools for tallying data and for creating a histogram in the Data Analysis ToolPak. Students will need to practise these features. Teaching Tip: Have students check the tallies in the histograms by classifying the data by hand for Example 1.3 on page 13. Discuss with students the use of the same horizontal scale for comparisons between two data sets (in this case, winery purchases by men and by women).
1.4 Making Decisions When data have been collected and analyzed, one is in a position to draw conclusions and make decisions. The conclusions derived will depend on how the data were gathered.

Observational Study: The researcher observes what is already taking place, and does not attempt to affect outcomes.
Experimental Study: The researcher actively intervenes, designing the study so that conclusions about causation can be drawn.

The differences between observational and experimental studies are introduced, and a general discussion of the challenges of drawing conclusions about causality is provided. Judgment is required to interpret the results of statistical analyses. In cases where judgment is required, one must be prepared to defend choices objectively. One should always critically evaluate the judgments others make in the context of their statistical analyses. This kind of discussion is often glossed over in introductory courses (if it is mentioned at all), but it is extremely important, and can be understood quite easily at a general level. If the students achieve nothing else in their statistics course, they should at least learn to avoid ridiculous conclusions that are not supported by the data. All students should gain an understanding that good judgment develops with practice, experience, and reflection. This does not mean that anything goes, or that students can "prove" anything with statistics.
1.5 Communication The approach, as with many statistics texts, centres on techniques for analyzing data and making decisions about population data on the basis of sample data. Once one is successful at mastering the techniques for gathering data and making interpretations, the next important step is to be able to communicate clearly both the methodology and conclusions drawn based on the results. There is a discussion of the importance of writing clear reports to summarize any statistical analysis. Students may get instruction about writing reports in other courses, but writing about statistical analysis has particular challenges, which are briefly described in the text. The guidelines for good
communication can be reinforced through the requirement that written reports be part of any assignments given in the course.
1.6 A Framework for Data-Based Decision Making The textbook outlines a general approach to data-based decision making. Points to consider:

1. Completely describe the goal of any decisions to be made.
2. Provide information about how the data were collected, organized and summarized.
3. Present both graphs and numerical support (i.e., in the form of summary tables).
4. Describe the statistical techniques used.
5. Describe and explain judgments about areas of uncertainty that apply.
6. Make a clear statement of conclusions or decisions, with justification (i.e., support the results).
The steps in the data-based decision-making process are summarized below.

1. Understand the problem and its context as thoroughly as possible. Be clear about the goal of good decision making.
2. Think about what kind of data would help you make a better decision. See if helpful data are already available somewhere. Decide how to collect data if necessary, keeping in mind the kind of conclusion you want to be able to make.
3. Collect the data, if the benefit of making a better decision justifies the cost of collecting and analyzing the data.
4. Examine and summarize the data (methods of descriptive statistics).
5. Analyze the data in the context of the decision that is required (the focus of inferential statistics). This may require using the sample data to:
   - Estimate some unknown quantity.
   - Test whether a claim (hypothesis) seems to be true.
   - Build a model of relationships between two quantitative variables.
6. Communicate the decision-making process. This requires:
   - A clear statement of the problem at hand, and the goal of the decision.
   - A description of how the data were collected or located.
   - A summary of the data.
   - A description of the estimation, hypothesis test, or model-building process(es).
   - A statement of what decision should be made, with justification.
There are some important ideas in Chapter 1, and adequate time should be devoted to them. These ideas do not particularly require computers or calculators, and so those students who have unreasonable fears about “the numbers” may find the general discussions of the material in Chapter 1 somewhat comforting. Generally, it does not take a math whiz to understand these ideas, but they are fundamental to an understanding of why statistical analysis is important.
Go to MyStatLab at www.mathxl.com. Introduce students to MyStatLab. Students can practise the exercises indicated with red as often as they want, and guided solutions will help them find answers step by step. They’ll find a personalized study plan available to them too! Introduce this extremely useful tool early in your course.
Discussion Questions:

1. What are some of the ways errors can occur in the sampling process? Give examples of each, and discuss their possible impacts on statistical results.

There are several kinds of errors that can occur, including coverage errors, nonresponse errors, response errors, processing errors, and estimation errors. Coverage errors occur because of inaccuracy or duplication in the survey frame. If a person is randomly selected to complete a survey and their identifying information is missed or recorded incorrectly (e.g., their phone number or address), then a coverage error has occurred. For example, consider a survey conducted with stratified random sampling, where a certain number of respondents were randomly selected from each region of a city. Stratified random sampling means that the population is broken down into subgroups (strata) to ensure adequate representation of each subgroup in the sample. If a respondent is missed or identified incorrectly, the results could be skewed: one region of the city could be overrepresented or underrepresented. The smaller the sample size, the more serious the coverage error. Nonresponse errors occur if some respondents refuse or fail to answer survey questions. This can be quite serious, significantly affecting the results of a survey. For example, people often refuse to answer survey questions about their household income. If the purpose of a survey was to determine whether a relationship between restaurant spending and household income exists, that purpose would be defeated if most respondents did not indicate their household income. If a survey given to teenagers asks about drug and alcohol use or sexual behaviour, the students may not respond if they fear their teachers or parents will see the results. Within a survey, some questions may have a good response rate while other questions are missing significant data; thus, the effective sample size may be different for each question. Missing data must be handled very carefully.
Other errors can arise in actually acquiring the data. If the survey questions are biased, misleading, long, or confusing, the respondent may not give accurate or truthful answers. For example, if a survey is repetitive and long, participants may simply select answers to finish the survey. Data acquisition, or response errors, can arise because of problems with the collection instrument (e.g., the questionnaire), the interviewer (e.g., bias), the respondent (e.g., faulty memory, false answers), or the survey process. Imagine that four surveys about radio use were sent to a particular home and respondents were asked to mail the surveys back. Only the wife completed her survey accurately and recorded the radio stations that she listened to each day. The wife then decided to fill out the other three surveys for her family, and she guessed at what stations her family members listened to each day. Thus, the final statistical analysis would be based on faulty data. Finally, processing errors can occur when data are being recorded and estimation errors happen when statistical techniques are applied incorrectly. Processing errors can occur in many ways including: 1) Numbers incorrectly reversed when initially recorded; 2) Data incorrectly transferred from the
handwritten survey responses to the computer; and 3) Data duplicated due to transcription or computer errors. Recall the example of a survey designed to determine the relationship between restaurant spending and household income. If only 30 people responded to the survey, and someone's household income was incorrectly recorded as $100,000 when it should have been $10,000, the final statistical analysis will be wrong and biased. Furthermore, estimation errors can occur if the statistical analysis being conducted is simply wrong. Certain conditions must be met before particular statistical tests can be performed. Ignoring these preconditions (such as checking for normality) can lead to completely faulty statistical analysis. I have sat through graduate-level lectures where students have incorrectly averaged ethnicities of survey responses to work out an average ethnicity! Students must be reminded that they cannot work out averages on nominal data. Brainstorm other ways these errors can occur with students, and return to these errors throughout the course. When students encounter statistics in their future jobs, these errors will become very real possibilities with serious consequences.

2. What is meant by random sampling? Give examples of random sampling and non-random sampling, and their practical uses.

Simple random sampling means that the elements of a population are chosen for the sample in a random fashion, with each element having an equal probability of inclusion. Taking random samples helps ensure that results from the sample are representative of the population. Imagine a researcher wants to survey opinions about anti-smoking advertising on a local television network. He takes his sample by standing in a designated smoking area outside an office building and selecting every 5th person who enters the area to take part in his survey. Is this a random sample? No, because not everyone in the population has an equal probability of being included in the survey. His survey will have a disproportionate number of smokers, because he sampled in a smoking area. His survey is also not random for other reasons: the researcher sampled respondents from only one particular office building at a certain time of day, which would definitely not be representative of the entire city. Designing and taking a simple random sample takes careful planning, time, and effort. There are cases where nonstatistical sampling is useful. In nonstatistical sampling, the elements of the population are chosen for the sample by convenience or according to the researcher's judgment. Nonstatistical sampling can be used to pilot survey questions, test new product ideas, and explore emerging trends. When people self-select themselves for a survey, the people are not chosen randomly, and this can significantly affect results. The results of such polls cannot be interpreted as representative of the entire population. In our college, survey booths are often set up in the cafeteria. Students are given incentives for completing questionnaires, including free frisbees, candy, and stationery supplies.
There is no way to ensure that the sample is random and representative of the entire student population, because students may only be approaching the survey booth because of the free gifts. Nonstatistical sampling must be interpreted with caution.
Chapter 2 Using Graphs and Tables to Describe Data

Learning Objectives:
1. Distinguish among different types of data.
2. Create frequency distributions and histograms to summarize quantitative data.
3. Create tables, bar graphs, and pie charts to summarize qualitative data.
4. Create time-series graphs.
5. Create scatter diagrams for paired quantitative data.
6. Be aware of and avoid common errors that result in misleading graphs, and understand the factors that distinguish interesting from uninteresting graphs.
Chapter Outline: 2.1 Types of Data 2.2 Frequency Distributions and Histograms for Quantitative Data 2.3 Tables, Bar Graphs, and Pie Charts for Qualitative Data 2.4 Time-Series Graphs 2.5 Scatter Diagrams for Paired Quantitative Data 2.6 Misleading and Uninteresting Graphs Overview: Chapter 2 starts with a description of different data types. The somewhat more accessible quantitative/qualitative/ranked descriptors are used, in preference to the less-easily understood interval/nominal/ordinal language. The graphs and tables are then organized according to data type.
2.1 Types of Data It should be noted that depending on the type of data involved, different methods are used to analyze data and make decisions. Understanding data types is crucial to one's ability to correctly identify the techniques that should be used. Variable: A characteristic or quantity that can vary. Data Set: Recordings of actual characteristics or quantities for a particular variable.
Quantitative Data: Data containing numerical information for which arithmetical operations such as averaging are meaningful; also referred to as numerical data, and sometimes as interval or ratio data.
Qualitative Data: Data containing descriptive information, which may be recorded in words or in numbers (where the numbers are codes for the associated categories; arithmetical operations such as averaging are not meaningful in this case); also referred to as nominal or categorical data.
At this point, the student should be gaining a strong appreciation for the difference between quantitative and qualitative data. It must be stressed that even though qualitative data can be represented numerically by tallies of frequencies, the data still represent characteristics, not measurements.
Continuous Variable: A quantitative variable, usually a measurement, that can take any possible value on the number line (possibly within upper and lower limits).
Discrete Variable: A quantitative variable that can take on only certain identifiable values on the number line (possibly within upper and lower limits). Counts, by definition, are treated as discrete variables; the counts represent the characteristics we are examining. Discrete variables can assume only certain values, and thus there are "gaps" between the values. Examples of discrete variables are such things as the number of rooms in a house, the number of cars arriving at a shopping centre per hour, and the number of planes leaving an airport per hour. Quantitative variables such as age, amounts paid in dollars, and measurements such as height or weight are considered continuous variables. Typically, continuous variables result from measuring something. On a number line, one can draw a solid line for a continuous variable because it can assume any value between two limits. Teaching Tip: I usually give my students ten examples of real-life situations, and have them classify the situations as continuous or discrete. Possible examples include patients' weights (continuous), liquid drug dosages (continuous), the number of errors in an audit (discrete), time to complete a crossword puzzle (continuous), etc.
Ranked Data: Qualitative data that can be ordered according to size or quality; sometimes referred to as ordinal data. Ratings from good to excellent of such things as cleanliness, service quality, ease of locating products, and taste are examples of ranked data. At this point, one should involve the students in describing other examples of ranked data.

Cross-sectional Data: Data that are all collected in the same time period.

Time-Series Data: Data that are collected over successive points in time.

An example of cross-sectional data would be the number of patrons at the theater in a particular month. Time-series data would compare the number of visitors to the theater in the same month over a period of years, allowing one to compare results over time, or to compute time-related averages.

Teaching Tip: As an activity, I put students in pairs and give each pair an envelope with real-life examples of each type of data (quantitative, qualitative, continuous, discrete, ranked, cross-sectional, and time-series) from newspapers and magazines. I ask students to classify each type of data and justify their response. I preselect real-life data examples to ensure students have experience with each type of data, and to ensure I can provide students with solutions after the activity is complete. In the past, I have provided students with newspapers and magazines to clip to complete this activity. This is one of the most popular activities of the semester, and it eases students into the study of statistics. Data classification is essential, because the appropriateness of statistical tests depends on it.
2.2 Frequency Distributions and Histograms for Quantitative Data Stem-and-Leaf Display: A statistical technique to present a set of data. Each numerical value is divided into two parts. The leading digit(s) becomes the stem and the trailing digit the leaf. The stems are located along the vertical axis, and the leaf values are stacked against each other along the horizontal axis.
The stem-and-leaf display, which students usually grasp easily, is provided as an introduction to both frequency distributions and histograms. This should provide students with an intuitive understanding of what a histogram shows. If students visualize the stem-and-leaf plot rotated 90 degrees counterclockwise, they can see the shape of the histogram. The stems become the categories on the horizontal axis. An advantage of the stem-and-leaf display over a frequency distribution is that we do not lose the identity of each observation. We are also able to see how the values within each class or range are distributed. Take a simple example of data and illustrate a stem-and-leaf display.

High Temperature
3 |
4 |
5 | 7
6 | 1 4 4 4 4 6 8
7 | 3 5 7 9
8 | 0 1 1 4 6
9 | 0 2 3

Low Temperature
3 | 9
4 | 3 6 8
5 | 0 0 0 2 4 4 5 5 7 9
6 | 1 8
7 | 2 4 5 5
8 |
9 |
One can compare the two temperature distributions directly; notice that the stems are identical, which makes the comparison fair.

Frequency Distribution: A summary table that divides quantitative data into non-overlapping ranges and records the number of observations or counts (the frequency) of data points in each range.
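For instructors who want to generate displays like the one above programmatically, here is a minimal Python sketch; the low-temperature values are simply read off the display itself.

```python
# Low-temperature data, transcribed from the stem-and-leaf display above.
lows = [39, 43, 46, 48, 50, 50, 50, 52, 54, 54,
        55, 55, 57, 59, 61, 68, 72, 74, 75, 75]

def stem_and_leaf(data):
    """Return {stem: sorted leaves} for two-digit values.

    Stems with no observations are omitted; add them explicitly if you
    want empty rows to appear, as in a side-by-side comparison.
    """
    display = {}
    for value in sorted(data):
        stem, leaf = divmod(value, 10)   # 54 -> stem 5, leaf 4
        display.setdefault(stem, []).append(leaf)
    return display

for stem, leaves in stem_and_leaf(lows).items():
    print(stem, "|", " ".join(str(leaf) for leaf in leaves))
```

The printed rows reproduce the Low Temperature display, which lets students verify a hand-built display against a computed one.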
Histogram: A bar graph of a frequency distribution, where the variable of interest is shown along the horizontal or x-axis and class frequencies shown along the vertical or y-axis. The width of the bar is the class (or range) width. The bars, which are non-overlapping, are drawn adjacent to each other. A major advantage to organizing data into a frequency distribution is that we now get a quick visual picture of the shape of the distribution without doing any further calculations. These pictures can be used to begin to assess the appropriateness of particular statistical tests. Example of qualitative data frequency distribution with histogram: (Notice that for qualitative data, there are spaces between the classes represented by the bars)
The data refer to quality levels from 1 "Not at all Satisfied" to 7 "Extremely Satisfied."

Rating    Frequency
3         2
4         4
5         12
6         24
7         18
Total     60

[Bar graph of the rating frequencies, with spaces between the bars]
Example of quantitative data frequency distribution with histogram: (Notice that for quantitative data, there are no spaces between the classes represented by the bars)

Computer Usage (Hours)    Frequency
0.0 - 2.9                 5
3.0 - 5.9                 28
6.0 - 8.9                 8
9.0 - 11.9                6
12.0 - 14.9               3
Total                     50
Class Width: One can use the following relationship as a guide to determine a class width for a frequency distribution:

    class width = (maximum value - minimum value) / √n

where n is the number of data points in the data set.

The important idea here is that there is more than one appropriate class size, and that some judgment is required to choose a suitable one. Students are often not comfortable with this need for judgment (they want a rule, and to be sure their answers are "correct"). They will need some support in their use of the class width template, which provides guidance but not "the correct answer". Demonstrating the use of the Class Width template a few times in class should help students develop their judgment about setting up classes for frequency distributions and histograms.
Judgment will always play a role in the final determination of the class width.

1. The classes should not overlap.
2. The classes should all be the same width, with open-ended classes avoided.
3. The lower limit of the first class should be an even multiple of the class width.
4. The frequency distribution should be clearly labelled, so that it is understandable to the reader without further description.
The Guide to Technique for setting up appropriate classes for a frequency distribution is on page 38. An Excel-spreadsheet approach can be demonstrated at this point. Unfortunately, Excel does a very poor job of both frequency distributions and histograms. Much of the problem arises from the fact that Excel uses intervals that exclude the lower limit and include the upper limit, while most statistics textbooks (including this book) use the opposite convention. The built-in histogram tool in Excel does not do a
good job of deciding the class width, and it is difficult to position values at tick marks underneath the graphs in Excel. The chapter refers to these difficulties directly, and provides a step-by-step guide to improving Excel's histogram on page 46.

Symmetric distribution: A distribution in which the right half is a mirror image of the left half.
Skewed to the right (or positively skewed): A distribution in which some unusually high values in the data set destroy the symmetry.
Skewed to the left (or negatively skewed): A distribution in which some unusually small values in the data set destroy the symmetry.
Demonstrate examples of symmetric and skewed distributions.
[Figure: a symmetric normal distribution, with the horizontal axis marked from -3 to +3]

[Figure: histogram of Adjusted Gross Income, with Income ($1000s) in classes from 0-24 through 175-199 on the horizontal axis and Frequency (millions) on the vertical axis]

Skewed to the right (positively skewed)
[Figure: histogram of Exam Scores, with Score in classes from Below 30 through 90-99 on the horizontal axis and Frequency on the vertical axis]

Skewed to the left (negatively skewed)

Outlier: a data point that is unusually far from the rest of the data.
Relative frequency: the proportion (or percentage) of total observations falling into a specific class.
Illustrate the relationship of frequency to relative frequency.

Rating    Frequency    Relative Frequency
3         2            0.03
4         4            0.07
5         12           0.20
6         24           0.40
7         18           0.30
Total     60           1.00
2.3 Tables, Bar Graphs, and Pie Charts for Qualitative Data Refer back to the qualitative and quantitative data used to prepare the charts in Section 2.2. At this point, one should illustrate the application of data to the preparation of bar charts and pie charts within Excel. As previously mentioned, qualitative data can be, and usually is, organized into a table which we have called a contingency table. This table is often called a cross-classification table. Contingency tables are tables with more than one column, with data cross-classified by row and column headings. Excel's Chart Wizard can be used to create bar graphs and pie charts for simple tables (see the Excel instructions on pages 63 and 64 respectively). Bar charts emphasize the relative sizes of the categories in qualitative data. Pie charts emphasize the share of the total represented by each category. Bar charts can also be created to illustrate the relationship between two or more qualitative variables. See the Excel instructions on page 67 dealing with contingency table data.
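A contingency table is just a cross-classified tally, which can be shown in a few lines of code. The responses below are invented (gender by preferred wine type, echoing the winery example from Chapter 1).

```python
from collections import Counter

# Hypothetical survey responses: (gender, preferred wine type) pairs.
responses = [
    ("M", "Red"), ("M", "White"), ("F", "Red"), ("F", "Red"),
    ("F", "White"), ("M", "Red"), ("F", "White"), ("M", "Red"),
]

# Counter tallies each (row, column) combination.
table = Counter(responses)

rows, cols = ["M", "F"], ["Red", "White"]
print("     " + "  ".join(cols))
for r in rows:
    print(r, [table[(r, c)] for c in cols])
```

The same cross-classification is what Excel's PivotTable feature produces; this sketch just makes the tallying logic explicit.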
2.4 Time-Series Graphs Time-series data are sequences of observations made over time. These observations are plotted in the order in which they occurred, with time along the x-axis. See the Excel instructions on page 72 dealing with time-series data. Time-series data become very important in regression analysis (for residuals) later in the course.
2.5 Scatter Diagrams for Paired Quantitative Data

Scatter diagram: A display of paired quantitative data points of the form (x, y).

[Figure: scatter diagram of Monthly Restaurant Spending vs. Monthly Income]
Explanatory variable (independent variable): A variable, observed (and sometimes controlled) by the researcher, that is the apparent cause of the change in the response variable. It is customary to graph the explanatory variable on the x-axis.
Response variable (dependent variable): A variable that changes when the explanatory variable changes. It is customary to graph the response variable on the y-axis.
Positive (direct) relationship: Increases in the explanatory (independent) variable correspond to increases in the response (dependent) variable.

Negative (inverse) relationship: Increases in the explanatory (independent) variable correspond to decreases in the response (dependent) variable.
2.6 Misleading and Uninteresting Graphs Another important feature of this chapter is the commentary provided about graphs once they are created. While the technical details of choosing and creating appropriate graphs are provided, there is also an emphasis on reflecting on what the graphs tell us. This chapter gives several good graphing suggestions:

1. By convention, the starting point (origin) for the x- and y-axes of any graph is the point (0,0). If this convention is not followed, the actual starting point must be clearly and visibly indicated on the graph.
2. It is never a good idea to add 3-D effects to a graph.
3. Titles and labels should objectively describe the associated data. Be aware of misleading or missing titles, and of missing labels on the axes.
4. Missing data is a trick sometimes used to mislead the reader; for instance, the omission of outlying points can distort a graph significantly and can lead to false conclusions. Always check data sources to make sure that important data have not been omitted from a graph.
5. Beware of titles on graphs that ask you to draw conclusions, and avoid using such titles.
6. Apply the following test to any graph: ask yourself whether the graph can be read and understood without any further supporting material or explanation. You should be able to answer "yes" to this question.
Remind students that the primary goal of a graph is to summarize and communicate data clearly. They should be careful not to create distorted images (3-D effects, for example) or cluttered images when presenting graphical material.
Teaching Tip: I give my students copies of newspapers and magazines to study examples of real-life graphs and see whether they meet the criteria listed above. We study several examples together as a class first. If you use the Google search engine to do an image search for any particular type of graph (e.g., bar graph, stacked area graph, line graph), you will find several examples of graphs from all domains that will interest students.
Go to MyStatLab at www.mathxl.com. Encourage students to use MyStatLab. Students can practise the exercises indicated in red as often as they want, and guided solutions will help them find answers step by step. They'll find a personalized study plan available to them too! Introduce this extremely useful tool early in your course.

Discussion Questions:

1. How does the media create misleading graphs? What impact can misleading graphs have on the general public?

One of the most common ways graphs are sensationalized in the popular media is by drawing the vertical axis so that it does not start at 0. Instead, the vertical axis starts slightly below the lowest data point being graphed. As a result, trends look exaggerated and can be misleading. For example, recall the listeria outbreak at Maple Leaf Foods. Graphs were produced in the newspaper that showed a sharp decline in the value of Maple Leaf stock. If students look quickly at the graph below from Reuters.com, they may believe that Maple Leaf shares dropped to nearly zero dollars in March 2009, during the height of the listeria crisis. Remind students to look at the vertical scale, and they will see that the lowest stock value was approximately $7.56. Such misleading graphs can have a huge impact on the buying patterns of the general public and on financial decision making about the stock. A holder of the stock may be tempted to sell it thinking it is worthless, when in fact the drop in value was only about $3 to $4 over the course of the year.
Source: http://www.reuters.com/finance/stocks/overview?symbol=MFI.TO
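The visual distortion can even be quantified. In this sketch the low price of $7.56 comes from the discussion above, while the high price of $11.00 is an assumed value consistent with the described $3 to $4 drop; the truncated axis start of $7.00 is likewise an assumption for illustration.

```python
# Approximate share prices (dollars); the high is an assumed value.
high, low = 11.00, 7.56

# With the axis starting at 0, the low point is still drawn at 69%
# of the high point's height -- a modest visual decline.
honest_ratio = low / high

# With the axis truncated to start at $7.00, the low point is drawn at
# only 14% of the high point's height -- a near-collapse visually.
truncated_ratio = (low - 7.00) / (high - 7.00)

print(round(honest_ratio, 2), round(truncated_ratio, 2))  # 0.69 0.14
```

Working through the two ratios in class makes it clear that nothing about the data changed; only the axis did.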
2. Why are frequency distributions important?

Frequency distributions are important because they help summarize large amounts of data in an easily understandable way. For example, imagine a hospital manager was presented with five columns of quantitative data about five hundred diabetes patients on a new drug. The raw data would not tell him or her very much unless they were organized in an appropriate way, preferably graphically. How many patients are responding to the new drug? Are their blood glucose levels within a healthy range? Frequency distributions are essential, because they are the basis of graphs -- including histograms, frequency polygons, and cumulative frequency polygons. While most of the general public has trouble understanding (or believes they cannot understand) raw data, most people have an intuitive sense of the message behind a graph. There is a saying that a "picture is worth a thousand words." The same is true of data presentation! A graph tells us far more than endless columns of data. A frequency distribution can be constructed for virtually any data set. They are extremely useful whenever a broad, easily understood description of data concentration and spread is needed. Most secondary data provided by third parties are grouped into a frequency distribution. Magazine and newspaper articles do not present long columns of data. Frequency distributions are important because they summarize data by grouping it into classes and tallying how many data points fall into each class. They can be used with nominal or interval data. A frequency distribution that tallied the favourite colour of each student in my introductory statistics class would have discrete classes, because there are a finite number of colours to choose from. The frequency distribution might indicate 12 people selected green, 15 blue, 6 red, 6 yellow, and 11 purple.
This type of discrete frequency distribution, based on nominal data, can be represented as a bar graph with the bars not touching each other. The raw frequency counts can be converted into percentages, which provide an even more useful description of the sample data. These percentages are called relative frequencies, and they allow us to make statements such as “24% of the class indicated green was their favourite colour” (12/50 = 0.24). Frequency distributions can also be used to group continuous data into classes. For example, a frequency distribution of the amount of money people spend per child for Christmas could be created. The classes may be “$0 to under $100”, “$100 to under $200”, “$200 to under $300”, and so on. With this type of frequency distribution, a histogram can be created where the bars touch. The statistician can make statements such as “25% of the parents sampled spent $100 to under $200 per child for Christmas.” Cumulative frequency distributions and frequency polygons can also be created, allowing statements such as “58% of the parents sampled spent more than $100 per child” or “35% of parents spent less than $200 on Christmas presents per child”. Frequency distributions are also important because two frequency polygons can be superimposed on the same graph for comparison. For example, the statistician could compare spending in the current shopping year with spending last year. The frequency distribution is the foundation of descriptive statistics. It is the basis for many types of graphs used to display data, and for the summary statistics used to describe a data set. Basic summary measures such as the mean, median, mode, variance, and standard deviation can be computed from a frequency distribution for grouped data.
Frequency distributions are important because they summarize large amounts of data, making trends recognizable; they help us make judgments about skewness and the shape of the distribution; and they are useful for comparisons and for determining the appropriateness of statistical tests.
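For instructors who want to demonstrate this answer concretely, the colour tallies above (and a hypothetical spending data set, invented for illustration) can be turned into frequency and relative frequency distributions in a few lines of Python:

```python
from collections import Counter

# Nominal data: tally favourite colours (counts match the example answer).
colours = ["green"] * 12 + ["blue"] * 15 + ["red"] * 6 + ["yellow"] * 6 + ["purple"] * 11
freq = Counter(colours)
n = len(colours)
relative = {c: count / n for c, count in freq.items()}
# relative["green"] == 12/50 == 0.24

# Continuous data: group hypothetical Christmas spending into $100-wide classes.
spending = [45, 120, 180, 150, 250, 90, 310, 175, 220, 130]
classes = Counter((amount // 100) * 100 for amount in spending)
# classes[100] counts values in the "$100 to under $200" class, and so on
```

The same tallies could, of course, be produced in Excel with COUNTIF or a pivot table; the sketch just shows how little work a frequency distribution requires.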
Chapter 3 Using Numbers to Describe Data
Learning Objectives:
1. Use conventions about the order of operations and summation notation to correctly evaluate statistical formulas.
2. Choose and calculate the appropriate measure of central tendency for a data set.
3. Choose and calculate the appropriate measure of variability for a data set.
4. Choose, calculate, and interpret the appropriate measure of association for a paired data set.
Chapter Outline: 3.1 Some Useful Notation 3.2 Measures of Central Tendency 3.3 Measures of Variability 3.4 Measures of Association
Overview: Chapter 3 contains a fairly standard discussion of measures of central tendency, variability, and association. The discussion emphasizes that the appropriate measures depend on the shape of the distribution (in particular, the degree of skewness), and the type of data. It is important to emphasize to students that they must examine a data set carefully before deciding which measure to use. Focusing on the distribution and the data type should start to develop good habits that will help them later, when they have to decide on the appropriate method for inference.
3.1 Some Useful Notation Inform the students that statistics involves some arithmetic, much of it repetitive. Computers are mainly used, but often arithmetical procedures are performed by hand with the use of a calculator as students learn new techniques or write tests. The chapter contains a section on the order of operations. Many students have not mastered this, or have forgotten it, and this makes it difficult for them to master the formulas that are used in statistics. Many of the examples used in this section are pieces of formulas that will come up later in the book. The section ends with the formula for sample standard deviation (without calling it that). The point of this section is to reduce student anxiety about the mechanics of calculations. When they first encounter standard deviation, they should be less troubled by the calculation, and more ready to understand the usefulness of the measure.
Order of Operations-Mathematics: it is important to remember that the conventions of mathematics and the order of operations always apply.
1. First evaluate anything in brackets.
2. Then evaluate anything with an exponent.
3. Working from left to right, do any division or multiplication.
4. Working from left to right, do any addition or subtraction.
Summation Notation:

Σ xi (for i = 1 to n) = Σx = x1 + x2 + x3 + … + xn

where n = the number of observations and i = the index number.
Go through a number of examples as demonstrated in the text, pages 103 through 105, to refresh the students’ skills.
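As a supplementary illustration of the order of operations at work inside a statistical formula, the following Python sketch (with made-up numbers) evaluates the sample standard deviation step by step, in exactly the brackets/exponent/division order listed above:

```python
# Evaluate sqrt( sum((x - xbar)^2) / (n - 1) ), respecting the order of
# operations: brackets first, then the exponent, then division, then the root.
import math

x = [4, 8, 6, 2]
n = len(x)
xbar = sum(x) / n                       # x-bar = 20/4 = 5.0
deviations = [xi - xbar for xi in x]    # brackets: (x - xbar)
squared = [d ** 2 for d in deviations]  # exponent: (x - xbar)^2
s = math.sqrt(sum(squared) / (n - 1))   # division, then the square root
```

Working through the same numbers by hand on the board reinforces that each line corresponds to one step of the convention.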
3.2 Measures of Central Tendency

Mean: A measure of central tendency calculated by adding up all the numbers in the data set, and then dividing the sum by the number of values. The symbol x-bar (x̄) is used to represent the sample mean or average. The Greek letter mu (µ) is used to denote the population mean. We often use the sample mean to estimate the population mean.

x̄ = Σx / n (the sum of the values of the n observations, divided by the number of observations in the sample)

µ = Σx / N (the sum of the values of the N observations, divided by the number of observations in the population)
Deviation from the mean: For each data point, the deviation is the distance of the point from the mean. Denoted by:
(x – x̄)
One should explain at this time that the sum of the deviations from the mean will be by definition equal to zero.
Σ(x – x̄) = 0

Median: the middle value (if there is a unique middle value), or the average of the two middle values (when there is not a unique middle value) in an ordered (from smallest to largest, or largest to smallest) data set.
Demonstrate the determination of the median, where there is a unique middle value and where there are two values in the middle (as noted, the median is the average of the middle two values). Explain to students that where we have highly skewed data, the median will give a more typical value than the mean and is often used in this case as a better measure of central tendency.
Mode: The most frequently occurring value in the data set.
Explain to the students that where one group of values occurs with greater frequency than all the others, there exists a single mode, which may be useful as a central tendency indicator. Where multiple groups of values occur at the same frequency, for example three sets of values each occurring four times in a data set, there are multiple modes. In the case of multiple modes, the indication of central tendency would not be useful. If the data have exactly two modes, the data are bimodal. If the data have more than two modes, the data are multimodal. Teaching Tip: Make sure students understand the appropriateness of the measures of central tendency. I use the example of ordering shoes in a shoe store. The manager should not want to order the mean shoe size; he would want to order the most frequently purchased shoe sizes. The mode is most appropriate here. I also use the example of a run-down, abandoned house in a subdivision. Such a house would bring average property values down (giving the data negative, or left, skewness), so the median resale price would be most appropriate here.
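The shoe-store and abandoned-house examples can be made concrete with Python's statistics module (all numbers here are invented for illustration):

```python
import statistics

# Shoe store: the mode, not the mean, tells the manager what to stock.
shoe_sizes = [7, 8, 8, 9, 9, 9, 10, 10, 11]
mean_size = statistics.mean(shoe_sizes)    # 9.0
median_size = statistics.median(shoe_sizes)  # 9
mode_size = statistics.mode(shoe_sizes)    # 9 -- the size to order most of

# A low outlier (the run-down house) drags the mean down but barely
# moves the median, so the median is the better "typical" resale price.
prices = [250, 260, 270, 280, 80]  # resale prices in $000s; 80 is the outlier
mean_price = statistics.mean(prices)     # 228.0
median_price = statistics.median(prices)  # 260
```

The contrast between 228 and 260 in the second example is usually enough to convince students that skewed data call for the median.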
The Guide to Decision Making for choosing a measure of central tendency is described on page 114. Excel instructions for using built-in statistical functions are described throughout the section.
3.3 Measures of Variability

Range: A number that is calculated as the difference between the maximum and minimum values in the data set. It is the simplest measure of variability.
Standard Deviation: A measure of variability in a data set that is based on deviations from the mean. It is the positive square root of the variance.

The variance is the average of the squared differences between each data value and the mean:

σ² = Σ(x – µ)² / N  for a population

s² = Σ(x – x̄)² / (n – 1)  for a sample

The standard deviation is computed as the positive square root of the variance:

σ = √σ²  for a population

s = √s²  for a sample
It is often best for students to break up the calculation of the standard deviation into two parts: first determine the variance, and then take its positive square root to obtain the standard deviation. The mathematical calculations by hand would follow this format.
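The two-part hand calculation can be mirrored in Python (illustrative data):

```python
import math

data = [2, 4, 4, 4, 5, 5, 7, 9]
n = len(data)
xbar = sum(data) / n  # 5.0

# Step 1: the sample variance (note the division by n - 1).
s_squared = sum((x - xbar) ** 2 for x in data) / (n - 1)

# Step 2: the standard deviation is the positive square root of the variance.
s = math.sqrt(s_squared)
```

Keeping the two steps visibly separate matches the way students will write the calculation on paper.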
An intuitive discussion of deviations from the mean is included. This calculation shows up often in formulas, and the goal is to get the students comfortable with it. In case a student is skeptical about whether the “computational formula” for standard deviation is equivalent to the “definitional formula”, the following demonstration should help. Students may need to be reminded of how to expand a binomial expression of degree 2. To demonstrate the equivalence of the two formulas, that is,

Σ(x – x̄)² / (n – 1) = [Σx² – (Σx)²/n] / (n – 1),

it is sufficient to show that

Σ(x – x̄)² = Σx² – (Σx)²/n

This can be shown by expanding the expression on the left-hand side, as follows:

Σ(x – x̄)² = Σ(x² – 2x̄x + x̄²)
= Σx² – 2x̄Σx + nx̄²
= Σx² – 2(Σx/n)(Σx) + n(Σx/n)²
= Σx² – 2(Σx)²/n + (Σx)²/n
= Σx² – (Σx)²/n
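A quick numerical check of the identity Σ(x – x̄)² = Σx² – (Σx)²/n may convince skeptical students faster than the algebra (the data are invented for illustration):

```python
# Numeric check that the definitional and computational forms agree.
x = [3, 7, 7, 19]
n = len(x)
xbar = sum(x) / n  # 9.0

definitional = sum((xi - xbar) ** 2 for xi in x)           # sum of squared deviations
computational = sum(xi ** 2 for xi in x) - sum(x) ** 2 / n  # shortcut form
# Both expressions evaluate to 144.0 for this data set.
```

Trying a second, arbitrary data set on the spot reinforces that the agreement is not a coincidence of the chosen numbers.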
Empirical Rule: The empirical rule applies only to a symmetric, bell-shaped distribution:
1. About 68% of the observations will lie within plus and minus one standard deviation of the mean.
2. About 95% of the observations will lie within plus and minus two standard deviations of the mean.
3. Almost all (99.7%) of the observations will lie within plus and minus three standard deviations of the mean.
To this point, the standard deviation as a measure of variability, with a larger standard deviation indicating greater variability, has been discussed. One can regard the standard deviation as a typical deviation from the mean in a data set. The standard deviation can also be used as a unit of measurement and is very useful in the application of statistical decision-making techniques. The empirical rule assists us in using the standard deviation as a tool for statistical analysis.
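A simulation can make the empirical rule vivid: generate a large bell-shaped sample and count the share of observations within 1, 2, and 3 standard deviations of the mean. In this Python sketch, the mean of 100 and standard deviation of 15 are arbitrary illustrative choices:

```python
import random

# Simulate a bell-shaped data set and check that the empirical rule roughly holds.
random.seed(1)
data = [random.gauss(mu=100, sigma=15) for _ in range(100_000)]

def share_within(k):
    """Proportion of observations within k standard deviations of the mean."""
    return sum(abs(x - 100) <= k * 15 for x in data) / len(data)

# share_within(1) is close to 0.68, share_within(2) to 0.95,
# and share_within(3) to 0.997, as the empirical rule predicts.
```

Repeating the simulation with a skewed data set (for example, exponential values) shows the rule failing, which previews the normality-check discussion.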
[Figure: a bell-shaped (normal) frequency distribution, with the horizontal axis marked at –3, –2, –1, 0, +1, +2, and +3 standard deviations from the mean.]

Students often ask: but what is a standard deviation? A standard deviation is just a unit of measurement. It tells us how many standard deviations above or below the mean a particular data value falls. The discussion of the empirical rule is provided to accustom students to this, and to help them see the usefulness of the measure for normally distributed data sets. As well, some students may ask: but what if the x-value is 2½ standard deviations from the mean? This sets up the discussion of the normal distribution and z-scores that comes later in the text. Percentiles are discussed here, but the focus is on their use for the interquartile range (IQR). The differences between the typical manual calculation of the IQR and Excel’s version are explicitly dealt with.
Percentile: A percentile provides information about how the data are spread over the interval from the smallest value to the largest value. The pth percentile of a data set is a value such that at least p percent of the items take on this value or less and at least (100 - p) percent of the items take on this value or more.
Quartiles: Divide a set of observations into four equal parts. First quartile = 25th percentile; second quartile = 50th percentile = median; third quartile = 75th percentile.
Use example 3.3d on page 128 in the text to demonstrate the determination of the 75th percentile, also known as the third quartile. Use example 3.3e on page 129 to show the determination of the interquartile range.

Interquartile Range: Measures the range of the middle 50% of the data values.
The preferred measure of variability for quantitative data is the standard deviation. However, the standard deviation is significantly affected by outliers. In such cases, a better measure of variability for skewed data sets is the interquartile range, because the IQR is calculated on the basis of the middle 50% of the data values. Demonstrate the calculation of percentiles, quartiles, and the interquartile range.
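The caution about manual versus Excel IQR calculations can be demonstrated with Python's statistics.quantiles, which implements two common quartile conventions; the point is simply that different methods yield different quartiles and hence different IQRs, just as hand and Excel calculations can disagree (the data are illustrative):

```python
import statistics

data = [1, 3, 5, 7, 9, 11, 13, 15]

# Two common quartile conventions give different cut points for the same data.
q_excl = statistics.quantiles(data, n=4, method="exclusive")  # [3.5, 8.0, 12.5]
q_incl = statistics.quantiles(data, n=4, method="inclusive")  # [4.5, 8.0, 11.5]

iqr_excl = q_excl[2] - q_excl[0]  # 9.0
iqr_incl = q_incl[2] - q_incl[0]  # 7.0
# Different conventions, different IQRs: never mix methods when comparing data sets.
```

This mirrors the Guide's warning below: compare IQRs only when they were computed by the same method.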
Guide to Decision Making: Choosing a Measure of Variability
1. The standard deviation is the preferred measure of variability for quantitative data. However, it can be significantly affected by outliers, and should not be used for badly skewed data.
2. The interquartile range can be used as a measure of variability for quantitative or ranked data. It is preferred for quantitative data when the data are skewed. When comparing data sets, never compare an IQR calculated by hand with an IQR calculated with Excel, as the methods are not directly comparable.
3. If the mean is an appropriate measure of central tendency for a data set, use the standard deviation as a measure of variability. If the median is the best measure of central tendency for a data set, use the interquartile range as a measure of variability.
3.4 Measures of Association

The discussion of the way the Pearson correlation coefficient, r, works is a bit unusual for an introductory text. However, students usually grasp the notion of summing the deviations from the mean quite easily, particularly with the help of the accompanying PowerPoint slides. The Spearman rank correlation coefficient is included so that students have an appropriate measure of association when data sets contain outliers, or when one or both of the variables is ranked.

Pearson Correlation Coefficient (r): A numerical measure that indicates the strength of the linear relationship between two quantitative variables. The Pearson correlation coefficient can take on values between -1 and +1. A value of -1 signifies a perfect negative linear relationship. A value of +1 signifies a perfect positive linear relationship. An r value of zero indicates no apparent linear relationship. Hence, the further the r value is from zero in either direction, the stronger the relationship. Further, a value of the Pearson correlation coefficient close to +1 or close to -1 indicates a strong linear correlation between the x-variable and the y-variable, but it does not prove that changes in the x-variable have caused changes in the y-variable. All we can say is that the variables are highly correlated; any conclusions about causality must be made on the basis of an understanding of the context of the data being analyzed. Calculation of the Pearson correlation coefficient is done in Excel with the PEARSON function. The following is one way to determine r, illustrating that the computation is based on the deviations from the mean for both the x- and y-values:

r = Σ[ ((x – x̄)/sx) ((y – ȳ)/sy) ] / (n – 1)

where sx = the standard deviation of the x-values in the sample, and sy = the standard deviation of the y-values in the sample.
Weaknesses with Pearson r: 1. The correlation coefficient r does not identify non-linear correlations. 2. The correlation coefficient r can be greatly affected by outliers, to the extent that it may give a misleading indication of correlation. Spearman Rank Correlation Coefficient: A numerical measure that indicates the strength of the relationship (linear or non-linear) for two variables, one or both of which may be ranked.
The Spearman rank correlation coefficient is usually denoted as “rs”
Provide demonstrations of Pearson and Spearman as appropriate. Guides to Decision Making are provided to help students choose the appropriate measures of central tendency, variability and association, depending on the characteristics of the data. The Guides should prove quite useful to students, but it may be necessary to draw their attention to the Guides. Refer the students to page 142 of the text:
Guide to Decision Making: Choosing a Measure of Association 1.
The Pearson correlation coefficient (r) is the preferred measure of association for quantitative data. However, it can be affected by outliers, and is less reliable when these are present in the data. As well, it is a measure of linear association only.
2.
The Spearman rank correlation coefficient (rs) can be used as a measure of both linear and non-linear association between two variables. It may provide a better indication of the correlation between two quantitative variables when there are outliers in the data. The Spearman rank correlation coefficient can also be used as a measure of association when one or both of the variables are ranked.
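A small Python sketch can illustrate both points of the Guide at once: the helper functions below are written directly from the deviations-based definition of r, and the data (invented for illustration, with no tied values) contain one outlier:

```python
import statistics

def pearson(x, y):
    """Pearson r from the deviations-based formula."""
    n = len(x)
    xbar, ybar = statistics.mean(x), statistics.mean(y)
    sx, sy = statistics.stdev(x), statistics.stdev(y)
    return sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / ((n - 1) * sx * sy)

def ranks(v):
    order = sorted(v)
    return [order.index(val) + 1 for val in v]  # assumes no ties, as here

def spearman(x, y):
    """Spearman rs is Pearson r computed on the ranks."""
    return pearson(ranks(x), ranks(y))

x = [1, 2, 3, 4, 5, 6]
y = [2, 4, 6, 8, 10, 120]  # the last point is an outlier
# pearson(x, y) is pulled well below 1 by the outlier, while
# spearman(x, y) equals 1.0 because the ranks remain perfectly monotone.
```

Excel users would reach the same Pearson value with the PEARSON function; computing Spearman in Excel requires ranking the data first.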
Go to MyStatLab at www.mathxl.com. MyStatLab is an online homework, tutorial, and personalized assessment system that accompanies this Pearson Education textbook. Chapter 3 introduces calculations. Encourage students to begin using MyStatLab on a regular basis to assist in their homework.
Discussion Questions:

1. Why is the empirical rule useful?

The empirical rule, otherwise known as the 68-95-99.7 rule, states that for a normal distribution, nearly all data values lie within 3 standard deviations of the mean. This rule is useful for quickly getting a rough estimate of an event’s probability, given the mean and standard deviation, and assuming the population is normal. The empirical rule can also be used as a simple test for outliers (if the population is assumed normal), and as a rough normality test (if the population is potentially not normal). We can also use the empirical rule to begin a discussion of unusual data values, which forms the basis of hypothesis testing. Anything that is not within 2 standard deviations of the mean is in the outer 5% of the normal bell curve. In other words, if 95% of the data lie within 2 standard deviations of the mean, then only 5% lie outside this range, in what are referred to as the “tails” of the distribution. Since the normal distribution is symmetric, each tail contains about 2.5% of the data. If a restaurant finds it is overcrowded only 1% of the time, this would be in the outer upper tail of the bell curve. We can also use the empirical rule as a normality test: if a distribution does not follow the empirical rule, it is most likely not a normal distribution (and may be skewed). Other techniques, introduced later in the textbook, are necessary for working with skewed distributions.
2. Why is the Spearman rank correlation coefficient sometimes advantageous to use instead of the Pearson correlation coefficient?

The Spearman rank correlation coefficient is a numerical measure that indicates the strength and direction of the relationship (linear or non-linear) between two variables, one or both of which may be ranked. Computed from ranks, it approximates the Pearson correlation coefficient computed from the original data. Because it uses ranks, the Spearman rank correlation coefficient is easier to compute by hand (an advantage). Spearman is an alternative measure of correlation that can identify both linear and non-linear correlations. Because it is based on ranks, it is also less affected by outliers, which is another advantage. Finally, it can be used to calculate a measure of association when one or both of the variables consist of ranked data.
Chapter 4 Calculating Probabilities
Learning Objectives:
1. Understand what probability is, and use both the classical and relative frequency approaches to probability.
2. Calculate conditional probabilities and use them to test for independence.
3. Use basic probability rules to calculate "and," "or," and "not" probabilities.
Chapter Outline: 4.1 Sample Spaces and Basic Probabilities 4.2 Conditional Probabilities and the Test for Independence 4.3 “And,” “Or” and “Not” Probabilities
Overview: Chapter 4 is a fairly focused discussion of probability. The material is practical in nature. All of the examples focus on business applications, to help students see the relevance of having some probability skills. The notion of using observed relative frequencies to estimate probabilities is explicitly discussed. The chapter begins with a focus on sample spaces, and the many ways they can be represented. Many students have trouble with probability calculations because they cannot adequately describe or imagine the entire sample space. It will be useful for students to recognize that there is a variety of ways to represent sample spaces (tables, pictures, tree diagrams). It is important for students to recognize that “tree-diagram problems” are not fundamentally different from “contingency-table problems”.
Experiment: Any activity with an uncertain outcome.
Event: One or more outcomes of an experiment.
Sample Space: A complete list or representation of all the possible outcomes of an experiment.
Probability: A measure of the likelihood of an event occurring.
Classical Definition of Probability: If there are “n” equally likely possible outcomes of an experiment, and “m” of them correspond to the event you are interested in, then the probability of the event is m/n.
Relative Frequency Approach to Probability: Based on a large number of repeated trials of an experiment. The probability of an event is its relative frequency over a large number of repeated trials.
Sample spaces can be represented with a picture, a listing, a contingency table, a joint probability table, or a tree diagram. Make sure students are provided with sufficient examples and have adequate time to study and apply each different representation.
4.1 Sample Spaces and Basic Probabilities

The focus on the sample space provides a way for students to recognize conditional probabilities. Students typically have trouble recognizing conditional probabilities, and some of this is probably because of the richness of the English language. Students may find it quite helpful to think of a conditional probability as a case where the sample space is narrowed down.

Contingency Table: A table used to classify sample observations according to two or more identifiable characteristics; also referred to as a cross-classification table or table of counts.
Relative Frequency: The proportion of trials in which an event occurs, over a large number of trials. Relative frequencies from samples are often used to estimate population probabilities.
Joint Probability Tables: A variation on contingency tables, where the counts are converted to relative frequencies, which can be interpreted as probabilities.
Joint Probability: A probability that measures the likelihood that two or more events will happen concurrently.
Teaching Tip: Students tend to find probability a very confusing and difficult topic to comprehend. Skuce’s approach is to show students that probability is relevant and useful to business students. I follow her approach, and introduce all probability rules through contingency tables in the first few lectures, before moving into the more complicated probability-based situations. I use contingency tables to show students the importance of conditional probability. Students tend to find conditional probability an abstract concept, but it is easily simplified and understood through the contingency table. For example, present students with a contingency table that separates smoking status by gender. Ask students to find the probability that a person is a smoker, given that they are male, i.e., P(Smoker | Male). If you know that the individual is male, students can cross out and ignore the female column in the contingency table. Contingency tables make the concept of conditional probability easier for students to understand. After contingency tables, I move into tree diagrams and conditional probability, independent events, and more complicated probability situations.
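The teaching tip's smoker-by-gender example can be sketched in Python with hypothetical counts, showing how conditioning on "male" restricts the sample space to one column of the table:

```python
# Hypothetical smoking-by-gender contingency table (counts).
table = {
    ("smoker", "male"): 20, ("smoker", "female"): 10,
    ("nonsmoker", "male"): 30, ("nonsmoker", "female"): 40,
}
total = sum(table.values())  # 100 people in the sample

# Unconditional probabilities, read from the margins of the table.
p_smoker = (table[("smoker", "male")] + table[("smoker", "female")]) / total  # 0.30
n_male = table[("smoker", "male")] + table[("nonsmoker", "male")]             # 50

# P(Smoker | Male): "cross out" the female column and use only the male column.
p_smoker_given_male = table[("smoker", "male")] / n_male  # 20/50 = 0.40
```

Because P(Smoker | Male) = 0.40 differs from P(Smoker) = 0.30 in these invented counts, the table also previews the test for independence in the next section.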
Make sure to devote adequate time to tree diagrams (see page 156).
4.2 Conditional Probabilities and the Test for Independence Conditional Probability: The probability of an event A given that another event B has already occurred; the usual notation is P(A│B), read the probability of A, given B.
Independent Events: Two events A and B are independent if the probability of one of the events is unaffected by whether the other event has occurred; that is, P(A│B) = P(A) and P(B│A) = P(B).

Two events are independent if the probability of one of the events is unaffected by whether the other event has occurred. To see whether two events A and B are independent, perform one of the following comparisons (the test for independence): compare P(A│B) with P(A), or compare P(B│A) with P(B). If the compared probabilities are equal, then events A and B are independent. If the compared probabilities are not equal, then events A and B are not independent (that is, they are related).
The point is made that the test for independence is revealing only for sample data, a point that is glossed over in many books. And/or/not probabilities are discussed, as a necessary prerequisite for how probability tables (such as the binomial) are used. Work through example 4.2B on page 163 to demonstrate the test for independence. It might also be useful to present students with an example of a contingency table that does pass the test for independence.
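A small Python sketch of a contingency table that does pass the test for independence (hypothetical counts, chosen so that P(A│B) works out equal to P(A)):

```python
# Hypothetical counts engineered so that A and B are independent.
counts = {("A", "B"): 12, ("A", "notB"): 18, ("notA", "B"): 8, ("notA", "notB"): 12}
total = sum(counts.values())  # 50

p_A = (counts[("A", "B")] + counts[("A", "notB")]) / total  # 30/50 = 0.6
p_B = (counts[("A", "B")] + counts[("notA", "B")]) / total  # 20/50 = 0.4

# Test for independence: compare P(A|B) with P(A).
p_A_given_B = counts[("A", "B")] / (counts[("A", "B")] + counts[("notA", "B")])  # 12/20 = 0.6
independent = (p_A_given_B == p_A)  # True: knowing B tells us nothing about A
```

Contrasting this table with one where the conditional and unconditional probabilities differ makes the test concrete for students.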
4.3 “AND,” “OR” and “NOT” Probabilities “AND” Probabilities The “and” probability is one that measures the likelihood that two or more events will happen concurrently, which is an example of joint probability. Joint probabilities can be read in a straightforward way directly from the contingency table. Work through Exhibits 4.19 and 4.20 on pages 165-166 to demonstrate this. There is a relationship between joint probabilities and conditional probabilities. For two events A and B, the conditional probability can be calculated as follows:
P(A│B) = P(A and B) / P(B)

Similarly, we have

P(B│A) = P(A and B) / P(A)
We can rearrange the formula above to state the general rule of multiplication. For two events A and B, the probability of A and B occurring can be calculated using the rule of multiplication: Rule of Multiplication: P(A and B) = P(A) ● P(B│A)
For the special case where the two events are independent, we apply a simplified version of the rule of multiplication as follows:
Rule of multiplication —Two Independent Events: P(A and B) = P(A) ● P(B)
“OR” Probabilities For the situation where the event “A or B” occurs, we interpret this to mean: A occurs, B occurs, or both A and B occur. Impress upon the students that in most cases, the term "or" will not be used in the question, but more likely something like “at least one” of the events. The rule of addition is used to determine the probability of A or B occurring. Rule of Addition: P(A or B) = P(A) + P(B) - P(A and B) or expanded P(A or B) = P(A) + P(B) - (P(A) ● P(B│A))
Indicate to the students that the relationship P(A and B) is needed to eliminate duplication; that is, double counting. Mutually Exclusive Events: Events that cannot happen simultaneously; if two events A and B are mutually exclusive, then: P(A and B) = 0
In the special case where we have mutually exclusive events, and we want to know the probability of at least one of two events occurring, one can use a simplified version of the addition rule as follows:
Rule of Addition—Mutually Exclusive Events: P(A or B) = P(A) + P(B)   (Remember: P(A and B) = 0)
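The multiplication and addition rules can be checked with a tiny Python sketch (the probabilities are invented for illustration):

```python
# Hypothetical probabilities for two events A and B.
p_A, p_B = 0.5, 0.4
p_B_given_A = 0.2  # B is less likely once A has occurred

# Rule of multiplication: P(A and B) = P(A) * P(B|A)
p_A_and_B = p_A * p_B_given_A  # 0.10

# Rule of addition: P(A or B) = P(A) + P(B) - P(A and B)
# (subtracting P(A and B) removes the double counting)
p_A_or_B = p_A + p_B - p_A_and_B  # 0.80

# If A and B were instead mutually exclusive, P(A and B) = 0 and the rule simplifies:
p_A_or_B_exclusive = p_A + p_B  # 0.90
```

Pointing out that 0.80 would be overstated as 0.90 without the subtraction drives home why the overlap term matters.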
“NOT” Probabilities

Complement of an Event A: Everything in the sample space that is not A; denoted by Ā, ~A, AC, or A′.
Advise the students that AC will be the notation for the complement of A used with this text. The complement rule summarizes the relationship between the probability of an event occurring, P(A), and the probability of its complement, P(AC); that is, the probability of the event not happening.

Complement Rule: The probability of an event A can be calculated using the complement rule as follows: P(A) = 1 – P(AC)
Tree Diagram: A form of graph that is helpful in organizing calculations that involve several stages. Each segment in the tree is one stage of the problem. The branches of the tree diagram are weighted by probabilities. One applies the complement rule to the tree diagram as one analyzes event occurrence probabilities.
At this point, it is beneficial for the student to have a demonstration of an example applying the complement rule to a tree diagram, as illustrated in the text with Example 4.3c on page 175.

Mutually Exclusive Versus Independent Events

It is important to impress upon the students that there is a difference between mutually exclusive and independent events. These two conditions are not the same, as summarized below. Events A and B are mutually exclusive if P(A and B) = 0. This means that A and B cannot both happen. Events A and B are independent if P(A) = P(A│B); that is, the fact that B happened does not affect the chances of A happening (or vice versa). The key to successful probability calculations is often careful thinking about what the language means. It is a good idea to explicitly tell students that their success will depend on their language skills more than on their computational skills.
Go to MyStatLab at www.mathxl.com. Students can take sample tests and quizzes on each chapter, and receive personalized feedback. Students often find probability difficult, and MyStatLab can give students the extra practice they need to be successful!
Discussion Questions

1. What is the difference between mutually exclusive and independent events?

When considering the probability of two or more events occurring, one has to decide whether the events are independent or related. It is not uncommon for people to confuse the concepts of mutually exclusive events and independent events. Two events are mutually exclusive if they cannot occur at the same time (i.e., they have no common outcomes). The simplest example is tossing a coin once. There are two outcomes: heads or tails. There is no overlap between the events. If we toss the coin twice, there are four outcomes: {HH, HT, TH, TT}, and there is no overlap between any of the four outcomes. Independent events are two events such that the outcome of event A has no effect on the outcome of event B. For example, consider tossing a coin twice. The outcome of the first toss (heads or tails) has no effect on the outcome of the second toss. Mutually exclusive events are disjoint, so their probabilities can be added using the simplified rule of addition, because there is no overlap: P(A or B) = P(A) + P(B). The probability of two independent events both occurring can be found using the simplified rule of multiplication: P(A and B) = P(A) × P(B). Thus the concepts of mutual exclusivity and independence are different. Independent events are two or more events such that the occurrence of one event has no effect on the probability of the occurrence of any other event; events A and B are independent if any of the following holds: (1) P(A│B) = P(A); (2) P(B│A) = P(B); (3) P(A and B) = P(A) × P(B). In contrast, mutual exclusivity refers to situations in which an observation cannot fall into more than one class (category).
2. Why is it sometimes easier to use the complement rule to find probabilities?

Sometimes it is easier to find the probability of an event’s complement than of the event itself, because it is tedious or impractical to compute the probability directly. Since an event is guaranteed to either happen or not happen, P(A) + P(AC) = 100%. For example, if you know the probability that a customer will buy a product is 45%, the probability that the customer will not buy the product is 55%. As another example, imagine we wanted to know the probability that at least 2 vacuums are returned to a department store, out of a total of 300 vacuums sold. There are many mutually exclusive possibilities in this case: exactly 2 vacuums are returned, exactly 3 are returned, exactly 4 are returned, and so forth, up to the final case of all 300 vacuums being returned. This would require us to evaluate P(X ≥ 2) = P(X = 2) + P(X = 3) + P(X = 4) + … + P(X = 300), a total of 299 separate calculations, where X is the random variable counting the number of vacuums returned. Instead of evaluating this tedious and impractical sum, one can find the probability that fewer than 2 vacuums are returned and subtract this result from 100%: P(X ≥ 2) = 100% – [P(no vacuums returned) + P(1 vacuum returned)] = 100% – [P(X = 0) + P(X = 1)]. This general discussion of the complement rule is a nice lead-in to the binomial distribution in the next chapter.
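The vacuum example can be sketched numerically in Python, assuming (hypothetically) that each of the 300 vacuums is returned independently with probability 0.01; the binomial formula used here is previewed from the next chapter:

```python
from math import comb

# P(at least 2 of 300 vacuums returned), assuming each vacuum is returned
# independently with a hypothetical probability of 0.01. Direct summation
# would need 299 terms; the complement rule needs only two.
n, p = 300, 0.01

def binom_pmf(k):
    """Binomial probability of exactly k returns out of n."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

p_at_least_2 = 1 - (binom_pmf(0) + binom_pmf(1))
```

Two small evaluations replace 299, which is exactly the efficiency argument made in the answer above.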
Copyright © 2011 Pearson Canada Inc.
Chapter 5 Probability Distributions

Learning Objectives:
1. Understand the concepts of discrete and continuous random variables and their associated probability distributions.
2. Recognize situations when the binomial probability distribution applies, and use a formula, Excel, or tables to calculate binomial probabilities.
3. Recognize the normal probability distribution, and use Excel or tables to calculate normal probabilities.
Chapter Outline:
5.1 Probability Distributions
5.2 The Binomial Probability Distribution
5.3 The Normal Probability Distribution

Overview: As with Chapter 4, the focus in Chapter 5 is developing the probability skills required for inference. The beginning of the chapter clearly illustrates how to develop a discrete probability distribution from basic probability rules, so that a clear link to the material in Chapter 4 is established. Only one discrete probability distribution (the binomial) and one continuous distribution (the normal) are discussed. If students can understand the concepts of each of these, it should not be difficult for them to accept and understand other probability distributions (such as the t-distribution) when they come up later in the book. Basic definitions to bring to the students' attention follow:
Random Variable: A variable whose value is determined by the outcome of a random experiment.
Discrete Random Variable: A random variable that can take any value from a list of distinct possible values.
Probability Distribution for a Discrete Random Variable: A list of all the possible values of the random variable, and their associated probabilities.
Continuous Random Variable: A random variable that can take on any value from a continuous range.
Probability Distribution of a Continuous Random Variable: Described with the use of a graph or a mathematical formula.
5.1 Probability Distributions It is possible to build a discrete probability distribution with probability rules. Discrete probability distributions can be represented with a table listing all possible values of the random variable and their associated probabilities. The distribution can also be represented graphically, with the possible values of the random variable along the x-axis, and the associated probabilities shown on the y-axis. Most probability distributions can be summarized with:
Expected values (or means), which are measures of the centre of the probability distribution. Standard deviations, which are measures of the variability of the probability distribution.
Mean of a discrete probability distribution µ : µ = ∑ (x • P(x)) The mean is a weighted average of the possible values of the random variable x, where the weights are the associated probabilities. The mean of a probability distribution is also referred to as the expected value of the random variable.
Teaching Tip: To illustrate this formula, I begin the discussion with a practical example of tossing a coin three times and counting the number of heads obtained. The random variable (the number of heads obtained in three tosses of the coin) can take on the values of 0, 1, 2, and 3. Students can work out the associated probabilities using common sense and tree diagrams if necessary. Students can then find the expected number of heads in three tosses of the coin using the formula above. They will quickly see that the expected number of heads, or average number of heads obtained in three tosses of the coin, is 1.5. This is expected because we can expect to get heads 50% of the time, since P(Head) = 50%. This practical example is easy for students to comprehend, and leads into the binomial distribution shortcut formulas later in this chapter.
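The coin-toss illustration can be checked by brute-force enumeration; this supplementary Python sketch builds the distribution and applies µ = ∑(x • P(x)):

```python
from itertools import product

# Build the distribution of X = number of heads in three coin tosses,
# then compute the expected value mu = sum of x * P(x).
outcomes = list(product('HT', repeat=3))          # 8 equally likely outcomes
dist = {}
for o in outcomes:
    x = o.count('H')
    dist[x] = dist.get(x, 0) + 1 / len(outcomes)

print(dist)     # {3: 0.125, 2: 0.375, 1: 0.375, 0: 0.125}

mu = sum(x * p for x, p in dist.items())
print(mu)       # 1.5
```

Note that the expected value, 1.5 heads, is not itself a possible outcome; it is the long-run average.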
Standard Deviation of a Probability Distribution, σ:

σ = √( ∑ ((x − µ)² • P(x)) )

This relationship shows that the calculation of the standard deviation is based on deviations from the mean. Another version of the relationship, more convenient for calculating the standard deviation by hand, follows:

Standard Deviation of a Probability Distribution, σ, Alternative Version of the Formula:

σ = √( ∑ (x² • P(x)) − µ² )
Once the definitions of the formulae for the expected value (mean) and the standard deviation have been discussed with the students, one should demonstrate with an example as in the text.
5.2 The Binomial Probability Distribution
The binomial probability distribution applies when we are interested in the number of times a particular characteristic turns up. A binomial random variable counts the number of times one of only two possible outcomes takes place. In a binomial experiment, something is done repeatedly; the repeated actions are referred to as trials. The count we are interested in is a random variable, because its outcome is determined by chance. The outcome we are looking for is called a success, and the other outcome (the complement of success) is a failure.
Requirements for a Random Variable to be Binomial:
1. There are only two possible outcomes for each trial of an experiment: success and failure. The probability of success in a given trial is denoted p. The probability of failure is 1 − p (often denoted q).
2. The binomial random variable is the number of successes in a fixed number of trials (n).
3. Each trial is independent of every other trial. The probability of success, p, stays constant from trial to trial, as does the probability of failure, q.
Mean of the binomial random variable:

µ = np

where n = number of trials and p = probability of success.

Standard deviation of the binomial random variable:

σ = √(npq)

where n = number of trials, p = probability of success, and q = probability of failure.
It should be stressed to students that most of the useful applications of the binomial distribution are not truly binomial, because sampling is done without replacement. This means that the trials are not independent, and so the experiment is not truly binomial. When the other conditions for a binomial experiment apply but sampling is done without replacement, the binomial distribution can still be used to approximate probabilities, as long as the sample is no more than 5% of the population. Most useful applications of the binomial distribution rely on this 5% criterion, because it is rare to encounter a real-life situation where sampling is done with replacement. Opinion polls are an example of an experiment conducted without replacement (a pollster talks with a respondent once; the respondent is not added back into the population pool). Impress upon students that the sample size relative to the population must meet the 5% criterion in order to support analysis with the binomial probability distribution. Every solution should note either that the sample size meets the 5% criterion or that the size of the population implies the sample is at most 5% of it. This step of the analysis must be included whenever sampling is done without replacement, because it is what justifies using the binomial probability distribution.

Binomial Probabilities with a Formula:
P(x) = nCx • p^x • q^(n−x)

where n = number of trials, p = probability of success, and q = 1 − p = probability of failure.

nCx = n! / (x!(n − x)!)

which represents the number of combinations of x items that you can choose from n things. n! is read as "n factorial". Special case: 0! = 1 by definition.
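The formula can be sketched directly in code (Python standing in for the BINOMDIST function used in the text). The mall-customers example from the tree-diagram discussion, with an assumed P(female) = 0.5, serves as a check:

```python
from math import comb, factorial

def binomial_prob(n, x, p):
    """P(X = x) = nCx * p**x * q**(n - x), with q = 1 - p."""
    q = 1 - p
    return comb(n, x) * p**x * q**(n - x)

# nCx counts the ways to choose x successes out of n trials; it can also
# be written with factorials: n! / (x! * (n - x)!).
assert comb(4, 2) == factorial(4) // (factorial(2) * factorial(2)) == 6

# Two of three customers entering a mall are female, assuming P(female) = 0.5:
print(binomial_prob(3, 2, 0.5))   # 0.375
```

The answer 0.375 = 3 × (0.5)³ matches the three equally likely branches {FFM, FMF, MFF} of the tree diagram.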
Teaching Tip: Most recent high school graduates from the college-level stream of Mathematics in Ontario have no experience with factorials, combinations, or permutations. College statistics teachers must take care to define and show examples of factorial notation. A brief discussion of combinations (and the combination key on the calculator) is also necessary for students to properly understand the binomial formula. Make sure to establish the relationship between combinations and the tree diagram. I begin my discussion of combinations with the simple example of asking students how many ways I can choose any 2 letters from {A,B,C,D}. The answer is 6, since the 2-letter combinations are {AB, AC, AD, BC, BD, and CD}. This gives 4C2 = 4!/(2!2!) = 6.
After this basic introduction, I move into tree diagrams and work through an example like Exhibit 5.7 on page 193. Emphasize the repetitive calculations in the tree diagrams for finding the different ways two of the three customers entering a mall will be female. The combinations {FFM, FMF, and MFF} are all identical because order does not matter with combinations (only the fact that two of the three customers are female).
For illustration of the use of the binomial formula, refer to example 5.2a on page 195. For illustration of the calculation of binomial probabilities with use of Excel, refer to example 5.2b on page 197. The Excel function used to calculate binomial probabilities is called BINOMDIST. Example 5.2c on page 182 illustrates how to use tables to calculate binomial probabilities. These tables, in the back of the book, can be used to calculate a limited number of binomial probabilities.
5.3 The Normal Probability Distribution
The student must understand that the normal distribution is the most important probability distribution. It is absolutely essential for many types of statistical inference, and it can even be used to approximate other distributions. It applies to many natural and physical situations, and many normal random variables are related to a measurement of some kind. Many continuous random variables are approximately normally distributed. Key properties:
1. The normal probability distribution takes the shape of the normal curve. This curve is sometimes referred to as the bell curve, because it is shaped like a bell.
2. The normal distribution is symmetric about the centre, or mean, of the distribution. If we cut the normal curve at this central value, the two halves will be mirror images.
3. The normal curve falls off smoothly in either direction from the central value.
4. The curve gets closer and closer to the x-axis but never actually touches it; that is, the "tails" of the curve extend infinitely in both directions.
5. With a continuous probability distribution, areas below the curve define probabilities, with the total area under the normal curve being 100%.
The normal random variable can take an uncountable number of possible values, which makes it impossible to list all the values with their associated probabilities in a table, as we could for a binomial distribution. In this case another approach is needed, but it is very similar to the graphical approach that was used for the binomial distribution. Illustrate the normal distribution, as with Exhibit 5.17 on page 202 of the text. For a normal random variable, P(x = a particular value) = 0; we can only find probabilities within a range of values. Make sure students are aware that, as a consequence, for any value such as 4: P(x < 4) = P(x ≤ 4). For normal probability distributions, we must determine how much area is under the normal curve. The determination can be done using a formula, Excel, or a table; the text concentrates on the Excel and table methods. Calculating Normal Probabilities with Excel: Excel has a number of functions associated with the normal distribution. Two that are described in this chapter are NORMDIST and NORMINV. Example 5.3a on page 207 illustrates the use of NORMDIST. Example 5.3b on page 210 illustrates the use of NORMINV. Calculating Normal Probabilities with a Table: A table supplied at the back of the text can be used to calculate normal probabilities. It is necessary to standardize the probability calculation using a z-score, which translates the location of a particular x in a normal probability distribution into a number of standard deviations from the mean. The z-score can be calculated using the following formula:

z = (x − µ) / σ
The student should be instructed in how to read the normal table. Example 5.3c on page 215 illustrates the use of the normal tables to calculate probabilities. One can also use the tables to locate an x-value that corresponds to a particular probability. Once you locate the desired probability in the body of the normal table, and the associated z-score, you can calculate the desired x-value with the following formula:

x = µ + zσ
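Both directions of the table lookup — standardizing with z = (x − µ)/σ and recovering x = µ + zσ — can be sketched with Python's standard library in place of the printed table. The fill-weight numbers below are invented for illustration:

```python
from statistics import NormalDist

# Hypothetical example: fill weights normally distributed with
# mu = 500 g and sigma = 10 g (invented figures).
mu, sigma = 500, 10
x = 515

# Standardize, then look up the area as a normal table would:
z = (x - mu) / sigma
print(round(z, 2))                       # 1.5
print(round(NormalDist().cdf(z), 4))     # P(X < 515) = P(Z < 1.5) ≈ 0.9332

# Going the other way: the x-value below which 90% of the area lies,
# recovered with x = mu + z * sigma:
z90 = NormalDist().inv_cdf(0.90)
print(round(mu + z90 * sigma, 1))        # ≈ 512.8
```

Drawing the curve and shading the area before computing, as recommended in the teaching tip below, keeps the direction of the inequality straight.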
Make a note to the students that normal probability calculations can sometimes lead to a z-score that is not included in the tables. Theoretically a normal random variable can be infinitely large or small, but almost 100% of the probability is accounted for by z-scores between −3.99 and 3.99. This is why the table stops at 3.99: the area under the normal curve beyond a z-score of 3.99 (or −3.99) is negligible.
Teaching Tip: Students need sufficient time exploring the normal distribution and working with related probability calculations. I make study guides with students for the different cases that can arise in word problems. Students need to understand the difference between “at least”, “at most”, “between”, “less than”, and “greater than”. I always have students draw diagrams when computing probabilities using the normal distribution; otherwise they are prone to make errors.
Go to MyStatLab at www.mathxl.com. Encourage students to use MyStatLab. Students can practise the exercises indicated with red as often as they want, and guided solutions will help them find answers step by step. They’ll find a personalized study plan available to them too! Introduce this extremely useful tool early in your course.
Discussion Questions:
1. What is an expected value of a discrete probability distribution? A random variable is a variable that takes on one of the outcomes of a probability experiment; random variables are denoted using letters such as X. A discrete random variable is a random variable whose set of possible values is finite or countable. The expected value (mean or average) of a discrete random variable X, denoted by E(X), is defined as the weighted average of the probability distribution. Note that the expected value is not necessarily the most probable value, or even a possible value, of the random variable; it is the long-run average over many repetitions of the experiment. Consider the practical probability experiment of tossing a coin three times and counting the number of heads obtained. The random variable (the number of heads obtained in three tosses of the coin) can take on the values of 0, 1, 2, and 3. As mentioned in an earlier teaching tip, the associated probabilities can be found using common sense and tree diagrams if necessary. Students can then find the expected number of heads in three tosses of the coin using the weighted average formula given in this chapter. They will quickly see that the expected number of heads, or average number of heads obtained in three tosses of the coin, is 1.5 heads, even though 1.5 heads can never occur on a single trial. This is expected because we can expect to get heads 50% of the time, since P(Head) = 50%. If the coin is tossed 50 times, students can expect to obtain 25 heads on average.
2. How can a discrete distribution like the binomial distribution be approximated by a continuous distribution like the normal distribution? What problems arise with this approximation? As long as certain conditions are met, normal distributions can be used to approximate the distributions of discrete random variables like the binomial distribution, because the shapes of the distributions are similar when the sample size is sufficiently large. However, a problem arises because the binomial distribution is discrete and the normal distribution is continuous. The basic difference is that with discrete values there are heights but no widths, while with a continuous distribution there are both heights and widths (when finding areas). For example, P(X=6) can be evaluated directly for a discrete distribution, but not for a continuous distribution, where the probability of any single value is zero. The nearest continuous counterpart of the discrete P(X=6) is the area P(5.5 < x < 6.5). We must apply a correction
factor, which is to either add or subtract 0.5 of a unit from each discrete x-value. This fills in the gaps to make the discrete distribution continuous, and allows the binomial distribution to be approximated by the normal distribution. Skuce does not use the correction factor in calculations in the textbook, but the concept is worth discussing with students. The distinction between discrete and continuous distributions must be made clear for students; the concept arises later in the discussion of sample sizes. Skuce uses a packaging example: x might be the number of packages dropped off for next-day delivery at a courier service. Obviously x is not a continuous random variable, because it is impossible to drop off 1.5 or 5.67 packages. However, the probability graph for the number of packages dropped off might be very close to a normal distribution (see Exhibit 5.20 on page 205). Therefore, students should not be surprised to see the normal distribution being used to calculate probabilities for some discrete random variables.
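The continuity correction can be sketched numerically; the parameters below are invented for illustration (recall that Skuce does not use the correction factor in the textbook):

```python
from math import comb, sqrt
from statistics import NormalDist

# Sketch of the continuity correction: approximate the discrete binomial
# P(X = 6) with the continuous normal area P(5.5 < X < 6.5).
n, p = 20, 0.3                       # invented parameters for illustration
mu, sigma = n * p, sqrt(n * p * (1 - p))

exact = comb(n, 6) * p**6 * (1 - p)**(n - 6)

nd = NormalDist(mu, sigma)
approx = nd.cdf(6.5) - nd.cdf(5.5)   # add/subtract 0.5 to give the bar a width

print(round(exact, 4))
print(round(approx, 4))              # close to the exact binomial value
```

Without the correction, the naive continuous calculation P(X = 6) would give zero, which is exactly the problem the discussion above identifies.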
Chapter 6 Using Sampling Distributions to Make Decisions

Learning Objectives:
1. Understand how probability calculations and sampling distributions can be used to make conclusions about populations on the basis of sample results.
2. Infer whether a population mean is as claimed or desired, on the basis of a particular sample mean, using the appropriate sampling distribution and probability calculations, when the population standard deviation, σ, is known.
3. Infer whether a population proportion is as claimed or desired, on the basis of a particular sample proportion, using the appropriate sampling distribution and probability calculations.
Chapter Outline:
6.1 The Decision-Making Process for Statistical Inference
6.2 The Sampling Distribution of the Sample Mean
6.3 The Sampling Distribution of the Sample Proportion
6.4 Hypothesis Testing

Overview: Chapter 6 departs from the usual approach to sampling distributions, because it focuses on the usefulness of the sampling distribution for inference before it gets into the mechanics of the relationship between the sampling distribution and the population distribution. The approach for making decisions (assuming the sampling distribution characteristics are known) is discussed in the first section. This should acquaint the student with the reasoning involved: that a claim is believed unless the sample result is so unlikely that it casts doubt on the claim. This is a fairly simple idea, but one that students often fail to grasp when they are distracted by trying to figure out the calculations and notation for sampling distributions. The ideas of Type I and Type II error are conveyed (without the sometimes confusing language). Implicitly, a p-value approach is used (again, without the sometimes confusing language). It is extremely important that students get comfortable with and develop some intuition about the way that statistical inferences are made. The reasoning is general, and if they can master this, they should be able to apply it to any hypothesis-testing situation they encounter. Note that the first set of questions refers to sample results that are proportions or means. Since the details of the sampling distribution are given, the student is free to focus on whether the sample result is rare enough to cast doubt on the claim being made about the population parameter.
Teaching Tip: Spend time in class discussing the way that statistical inferences are made in the real world, and the importance of statistical decision making. When introducing the topic of statistical decision making, I make reference to current events and statistical significance in the media. As an example, consider that recently the Canadian government had to withdraw a batch of the H1N1 Swine Flu vaccine, because more people were having an allergic reaction to this particular batch than to previous batches. How did Health Canada determine that more people than normal were having an allergic reaction? The sample result must have been highly unusual as compared to expected results. Ask students to brainstorm other situations where similar decisions are made based on statistical inference. These might include deciding whether to adjust a bottling line in a cola factory, whether someone has lost a significant amount of weight on a diet program, whether male children are taking longer to learn to read than female children, etc. Students need to understand the importance of the statistical decision-making process.
6.1 The Decision-Making Process for Statistical Inference Statistical inference is a set of techniques to allow reliable conclusions to be drawn about the population data, on the basis of sample data. A random sample is drawn from the population, and a sample statistic is calculated. While the sample statistic is unlikely to be exactly equal to the claimed or desired value of the population parameter, it should be reasonably close. If it is not, the sample gives us the evidence to doubt that the population is as claimed or desired. Sampling Distribution: The probability distribution of all possible sample results for a given sample size.
The sampling distribution is the key to the ability to distinguish whether sample results are consistent with the claimed or desired value of the population parameter. We use the sampling distribution to calculate the probability of getting a sample result as extreme as the one we observed from the sample, assuming the claim about the population is true. For the purposes of this chapter, we use a general rule of declaring a sample result unusual if the associated tail probability is 5% or less.
When is a Sample Result Unusual? We calculate the probability of the sample result (SR) as follows:

If the sample result is higher than expected, calculate P(SR ≥ the observed value).
If the sample result is lower than expected, calculate P(SR ≤ the observed value).

where P = probability and SR = sample result. We conclude that the population parameter is not as claimed or desired only if the probability of such an extreme sample result is 5% or less.

Often results are discussed in the following manner:
1. The sample result is not unusual, and there is insufficient evidence that the population parameter is not as desired.
2. The sample result is unusual, and there is sufficient evidence that the population parameter is not as desired.
The sampling distribution is closely related to the distribution of the population from which the sample was drawn. Once we know what the population characteristics are supposed to be, we can calculate the characteristics of the sampling distribution. Two particular sampling distributions are described in Chapter 6: the sampling distribution of the sample mean, and the sampling distribution of the sample proportion.
6.2 The Sampling Distribution of the Sample Mean
The sampling distribution of the sample mean is described, assuming σ is known. An empirical examination of the relationship between the population distribution and the sampling distribution is included, to help students develop their intuition about this. At the end of section 6.1, students are alerted to the fact that it is generally impossible to actually know the population standard deviation. There are times when we want to make decisions about the population mean, µ, on the basis of a sample mean, x̄ (x-bar). For a sample size of n and a population with a claimed or desired mean of µ:

The standard deviation of the sample means (the x̄-values):

σ_x̄ = σ / √n

where σ_x̄ = the standard deviation of the x̄-values, also referred to as the standard error, and σ = the standard deviation of the x-values. As the sample size (n) gets larger, the standard error gets smaller.
The mean of the sample means (the x̄-values) is equal to the mean of the original population:

µ_x̄ = µ

where µ_x̄ = the mean of the x̄-values and µ = the mean of the original x-values.

Normal Distribution consideration for analysis: The sampling distribution of the sample means (x̄'s) will be normally distributed if the original population of x's is normally distributed, or if the sample size is large enough.
Teaching Tip: Students can empirically explore the above consequences of the Central Limit Theorem using Exhibit 6.10 on page 238. Ask each student to take a random sample of nine paint cans from the data provided, and compute the sample mean. As an instructor, record each sample mean in a spreadsheet. Use Excel to create a histogram of the original data and of the sample means. Ask students to comment on the shape of the resulting sampling distribution of sample means. How does the sampling distribution of sample means compare to the original population? You can repeat this exercise with students using a sample size of 25. Ask students to compare results between both sample sizes (n=9 and n=25). Students can compare their results to those found in Exhibit 6.12 on page 239.
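For instructors who prefer simulation to the manual exercise, the same comparison can be sketched in Python. A right-skewed population is invented here to stand in for the paint-can data of Exhibit 6.10:

```python
import random
import statistics

# Invented right-skewed population (exponential, mean about 50) standing in
# for the paint-can data; any skewed population illustrates the same point.
random.seed(42)
population = [random.expovariate(1 / 50) for _ in range(1000)]

def sample_means(n, reps=2000):
    """Means of `reps` random samples of size n from the population."""
    return [statistics.mean(random.sample(population, n)) for _ in range(reps)]

means_9 = sample_means(9)
means_25 = sample_means(25)

# The mean of the sample means stays near the population mean,
# while the spread (the standard error) shrinks as n grows.
print(round(statistics.mean(population), 1))
print(round(statistics.mean(means_25), 1))
print(round(statistics.stdev(means_9), 1), round(statistics.stdev(means_25), 1))
```

Histograms of means_9 and means_25 (in Excel or elsewhere) show the mound shape emerging and tightening, mirroring Exhibit 6.12.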
The Central Limit Theorem is discussed, with the advice that normality of the population (tested by creating a histogram of the sample data) is less important as sample size increases. Some guidelines for assessing the normality of the sampling distribution of x̄ are included on page 241. The Central Limit Theorem: If all samples of a specified size are selected from any population, the sampling distribution of the sample means is approximately a normal distribution. This approximation improves with larger samples.
Teaching Tip: Students need to keep in mind that the Central Limit Theorem applies only to the mean and not to other statistics. Remind students of this fact.
Teaching Idea: The Central Limit Theorem can be illustrated to students through the use of applets available on the Internet (web address provided below), or through a small numerical example. Students often have trouble grasping that the sampling distribution of sample means will be normal when all samples of a specified size are taken, even if the original population is skewed. To illustrate this concept, teachers can work with a small population to show how the original population may not be normal, but the sampling distribution of sample means is approximately normal if all samples of a particular size are taken. Here is a small numerical example. Consider the following population of students which shows the number of hours each student spent studying for a Statistics test.
Student      Hours Spent Studying
1. Nicky     3
2. Will      9
3. Jeff      3
4. Diana     6
5. Mohab     9
We can create a discrete frequency distribution and histogram for this population as follows.
Hours Spent Studying   Frequency
3                      2
6                      1
9                      2
The original population is clearly not a normal distribution. It is bimodal.
If we take all possible subsamples of size two, there will be 10 such samples, since 5C2 = 10. (Use the combination, nCr, button on the calculator.)
Combination   Sum   Sample Mean
1,2           12    6
1,3           6     3
1,4           9     4.5
1,5           12    6
2,3           12    6
2,4           15    7.5
2,5           18    9
3,4           9     4.5
3,5           12    6
4,5           15    7.5
We can create a discrete frequency distribution and histogram for the sampling distribution of the sample means as follows:
Sample Mean (Hours)   Frequency
3                     1
4.5                   2
6                     4
7.5                   2
9                     1
We can now graph the sampling distribution of the sample means as a histogram:
Students can clearly see that the resulting histogram of the sample means is approximately normal: symmetric and mound-shaped, and much different from the original population, which was bimodal. Remind students that the sampling distribution of sample means will be approximately normal when all possible samples are selected from a particular population, and that the approximation improves as the sample size increases; other examples of small populations may not work as neatly as the one above. This example illustrates that the sampling distribution of the sample means can have a different shape from the original population when all possible subsamples of a given size are taken. Have students confirm that the mean of the original population and the mean of the sample means are equal (when all possible samples are taken), i.e., µ_x̄ = µ. Examples with larger populations can be found online in applets at: http://www.intuitor.com/statistics/CentralLim.html.
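The enumeration above can be verified with a short sketch:

```python
from itertools import combinations
from statistics import mean

# The numerical example above, checked by enumeration.
hours = [3, 9, 3, 6, 9]                   # Nicky, Will, Jeff, Diana, Mohab

sample_means = [mean(pair) for pair in combinations(hours, 2)]
print(len(sample_means))                  # 10 samples, since 5C2 = 10
print(sorted(sample_means))               # [3, 4.5, 4.5, 6, 6, 6, 6, 7.5, 7.5, 9]

# Mean of the sample means equals the population mean:
print(mean(sample_means) == mean(hours))  # True
```

The sorted list reproduces the frequency table (one 3, two 4.5s, four 6s, two 7.5s, one 9), and the final check confirms µ_x̄ = µ = 6 hours.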
The central limit theorem tells us that the sampling distribution of x̄ can be normal even if the original population is not normal. The sample size required depends on how far from normal the population is. To assess whether the population is normal, create a histogram of the sample data (unless there are too few data points); Excel can be useful for this. The following guidelines assist in deciding whether the sampling distribution of x̄ is normal. The criterion that the population be approximately normal is important because it allows us to use the techniques illustrated in this section.

Assessing Population Normality:
1. If there are outliers in the sample, you should proceed with caution, no matter what the sample size.
2. If there is more than one mode in the sample data, you should proceed thoughtfully. While large enough sample sizes could result in a normal sampling distribution of x̄, you should investigate whether your data might be coming from more than one population.
3. With small sample sizes, less than about 15 or 20, the population data must be normal for the sampling distribution to be normal. The histogram of sample data should have a normal shape, with one central mode and no skewness.
4. With sample sizes in the range of about 15 or 20 to 40, the histogram of sample data should have a normal shape, with one central mode and not much skewness. If these conditions are met, the sampling distribution will likely be normal.
5. With sample sizes above 40 or so, the sampling distribution will probably still be normal, even if the sample histogram is somewhat skewed.
Teaching Tip: To illustrate the above guidelines during a lecture or lab class, pre-select different populations available on the CDROM or in MyStatLab. Assess the population normality of each using the criteria above. This activity could be instructor-led and done as a class, or could be completed by small groups of students. Each small group could explore a population and assess its normality, and present their findings to the class.
Guide to Decision Making: Using the Sampling Distribution of x̄ to Decide about µ when σ is Known

Situation: Quantitative data, one sample, one population. Trying to make a decision about µ on the basis of x̄, with σ given.

Steps:
1. Identify or calculate: x̄, the sample mean; σ, the population standard deviation; µ, the desired or claimed population mean; and n, the sample size.
2. Check for normality of the sampling distribution, by assessing the normality of the population (usually with a histogram of the sample values).
3. If the sampling distribution is likely to be normal, proceed by identifying or calculating the mean and standard deviation of the sampling distribution, using the following formulas: µ_x̄ = µ and σ_x̄ = σ/√n.
4. Use this sampling distribution to calculate the probability of a sample result as extreme as the x̄ from the sample. If x̄ is above µ, calculate P(x̄ ≥ observed sample mean); if x̄ is below µ, calculate P(x̄ ≤ observed sample mean).
5. If the calculated probability is 5% or less, there is convincing evidence that the population is not as claimed or desired.
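The five steps can be sketched end to end; the bottling numbers below are invented for illustration, and Python's NormalDist stands in for the normal table:

```python
from math import sqrt
from statistics import NormalDist

# Step 1 (invented example): a bottler claims mu = 355 ml with sigma = 2 ml;
# a sample of n = 36 bottles averages x-bar = 354.2 ml.
x_bar, mu, sigma, n = 354.2, 355, 2, 36

# Steps 2-3: with n = 36 the sampling distribution of x-bar is normal,
# with mean mu and standard error sigma / sqrt(n).
std_error = sigma / sqrt(n)                  # 2 / 6 ≈ 0.333

# Step 4: x-bar is below mu, so compute P(x-bar <= 354.2).
tail_prob = NormalDist(mu, std_error).cdf(x_bar)
print(round(tail_prob, 4))                   # ≈ 0.0082

# Step 5: unusual if the tail probability is 5% or less.
print(tail_prob <= 0.05)                     # True: evidence mu is not 355 ml
```

An individual bottle at 354.2 ml would be unremarkable; it is the sample mean of 36 bottles, judged against the much smaller standard error, that is unusual.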
6.3 The Sampling Distribution of the Sample Proportion The discussion of the sampling distribution of the sample proportion begins with a discussion of how decisions about population proportions can be made using the binomial distribution. Now that computers are widely available, it is easy to use such an approach, and particularly when sample sizes are small, it is more accurate. The sampling distribution of p̂ is also discussed. Sometimes you want to make decisions about a population proportion, p, on the basis of the sample proportion, p̂. We can do this by two different methods, depending on whether we focus on counts (the number of successes) or the proportion of successes. Example 6.3a on page 245 illustrates using the binomial distribution to make a decision about a population proportion. Example 6.3b on page 250 illustrates using the sampling distribution of p̂ to make a decision about a population proportion. These are just two ways of representing the same information. No matter which method is used, if sampling is done without replacement, ensure that the sample is no more than 5% of the total population, so that the binomial distribution is the appropriate underlying probability model. If we focus on the counts (the number of successes) rather than the proportion of successes, we can calculate the probability of a sample result as extreme as the one we observed using the binomial probability distribution. Generally, this will require the use of a computer. The approach is the same as before: if the probability of getting a sample result as extreme as the one we observed is 5% or less, we consider that to be evidence that the population proportion is not as claimed. Example 6.3a on page 245 illustrates this. If we focus on the proportion of successes, we can (in some cases) use the sampling distribution of p̂ to assess the probability of a sample result as extreme as the one we observed.
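The count-based (binomial) approach can be sketched directly with an exact tail probability. The defect-rate scenario below is hypothetical, not taken from Example 6.3a:

```python
from math import comb

def binom_tail_ge(n, p, k):
    """Exact P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

# Hypothetical: claimed defect rate p = 0.10; a sample of n = 50 shows 9 defectives.
prob = binom_tail_ge(n=50, p=0.10, k=9)
print(f"P(X >= 9) = {prob:.4f}")  # just above the 5% threshold in this case
```

Because the computation is exact, no normality conditions are needed; this is why the binomial approach is preferred for small samples when a computer is available.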
For a sample size of n and a binomial distribution with a claimed or desired proportion p:

The mean of the sample proportions is equal to the population proportion:

µp̂ = p

where µp̂ = the mean of the sample proportions and p = the population proportion.

The standard deviation of the sample proportions (the p̂-values) is:

σp̂ = √(pq/n)

which is referred to as the standard error of the sample proportion.
The sampling distribution of p̂ continues to be discussed. Conditions for its use are that np and nq are at least 10. The reason for this is as follows: the number of successes in a binomial experiment can range from 0 to n. The majority of the data in a normal distribution is located within three standard deviations below and three standard deviations above the mean (approximately 99.7% of the population, according to the Empirical Rule discussed in Chapter 3). So, if a binomial distribution is to be approximately normal, the mean must be at least three standard deviations above 0, and at least three standard deviations below n. This requires that:
mean − 3 standard deviations ≥ 0:
  np − 3√(npq) ≥ 0
  np ≥ 3√(npq)
  n²p² ≥ 9(npq)
  np ≥ 9q
  since 0 ≤ q ≤ 1, this requires np ≥ 9

mean + 3 standard deviations ≤ n:
  np + 3√(npq) ≤ n
  3√(npq) ≤ n − np
  3√(npq) ≤ n(1 − p)
  3√(npq) ≤ nq
  9npq ≤ n²q²
  9p ≤ nq, rewriting: nq ≥ 9p
  since 0 ≤ p ≤ 1, this requires nq ≥ 9

In the interests of using a nice round number, the requirement is that np ≥ 10 and nq ≥ 10. This requirement handles more extreme cases, where p or q is close to 1. When p or q is closer to 0.5, it is more than is necessary. However, the more stringent requirement also means that the normal approximation to the binomial (on which the sampling distribution of p̂ relies) will be more accurate, as it will generally require larger sample sizes (where the continuity correction would not be so significant).
Normal Distribution Conditions for Analysis: The sampling distribution will be approximately normal as long as both of the following required conditions are met:
np ≥ 10 and nq ≥ 10 p = probability of success q = (1 – p) = probability of failure n = number of trials These conditions must be checked before you use the sampling distribution of p̂ . If these conditions are not both met, you will have to use the binomial distribution to make your decision.
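The condition check above can be captured in a small helper. The two calls mirror the n = 200 and n = 20 (p = 0.05) cases discussed in the text:

```python
def normal_approx_ok(n, p, threshold=10):
    """Check np >= 10 and nq >= 10 before using the sampling distribution
    of the sample proportion; otherwise fall back to the binomial."""
    q = 1 - p
    return n * p >= threshold and n * q >= threshold

print(normal_approx_ok(200, 0.05))  # True: np = 10 and nq = 190
print(normal_approx_ok(20, 0.05))   # False: np = 1, far below 10
```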
Example 6.3b on page 250 should be used to illustrate this situation.
Teaching Tip: To illustrate the above requirements, refer to Exhibit 6.19 on page 249. It illustrates that the sampling distribution of the sample proportion for n = 200, p = 0.05 is approximately normal because the conditions above are met. Students could create the sampling distribution of the sample proportion for n = 20, p = 0.05 to see that the resulting distribution is not normal, because the above conditions are not met. This is not a proof of the theorem, but it gives students some empirical evidence that the above conditions must be satisfied for the sampling distribution of the sample proportion to appear normal.
Correction Factors: When approximating a discrete probability distribution with a continuous one, the straightforward calculation using p̂ will tend to underestimate the true probability. This bias is corrected by the use of something called the continuity correction factor, which is beyond the scope of this text. However, the continuity correction factor is not so important for larger values of n, and that is the situation that will be assumed throughout the text exercises. A Guide to Decision Making using the sampling distribution of p̂ starts on page 252, and also follows on the next page of this instructor's manual.
Guide to Decision Making: Using the Sampling Distribution of p̂ to Decide About p. Situation:
Qualitative (binomial) data, one sample, one population
Trying to make a decision about p on the basis of p̂
Sample size (n) large
Computer not available, so using a binomial probability calculation is not practical
Steps: 1.
Identify or calculate:
p, the desired or claimed population proportion
p̂, the sample proportion
n, the sample size
2.
If sampling is being done without replacement, check that the sample is less than 5% of the total population, so the binomial distribution is an appropriate underlying model.
3.
Check that the underlying binomial distribution can be approximated by a normal distribution, which requires both of the following (if these conditions are met, the sampling distribution of p̂ will be approximately normal): np ≥ 10 and nq ≥ 10
4.
If this sampling distribution is approximately normal, proceed by identifying or calculating the mean and standard deviation of the sampling distribution, using the following formulas:

µp̂ = p        σp̂ = √(pq/n)
5.
Use this sampling distribution to calculate the probability of a sample result as extreme as the p̂ from the sample.
If p̂ is above p, calculate P(p̂ ≥ observed sample proportion). If p̂ is below p, calculate P(p̂ ≤ observed sample proportion).
If the calculated probability is 5% or less, there is convincing evidence that the population proportion is not as claimed.
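The guide's steps can be combined into one sketch. The claimed p = 0.20, n = 400, p̂ = 0.25 figures are hypothetical, chosen only to illustrate the flow:

```python
from statistics import NormalDist

def phat_decision(p, phat, n):
    """Informal decision about p via the sampling distribution of p-hat.
    Returns the tail probability, or None if the normality conditions fail
    (in which case the binomial distribution should be used instead)."""
    q = 1 - p
    if n * p < 10 or n * q < 10:
        return None
    dist = NormalDist(p, (p * q / n) ** 0.5)  # mean p, standard error sqrt(pq/n)
    return 1 - dist.cdf(phat) if phat >= p else dist.cdf(phat)

# Hypothetical: claimed p = 0.20, sample of n = 400 gives p-hat = 0.25.
prob = phat_decision(p=0.20, phat=0.25, n=400)
print(f"P(p-hat >= 0.25) = {prob:.4f}")  # about 0.0062, below 5%
```

Since 0.0062 is well under 5%, this sample result would be convincing evidence that the population proportion is not 0.20.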
6.4 Hypothesis Testing The end of chapter 6 (section 6.4) alerts the student that there is more to hypothesis testing than is presented in chapter 6, and sets up some of the details that will be presented in the next chapter. Only a few end-of-chapter problems are presented, so that the students do not get too accustomed to the informal approach, before they learn how a formal hypothesis test is conducted.
Go to MyStatLab at www.mathxl.com. Encourage students to use MyStatLab. Students can practise the exercises indicated with red as often as they want, and guided solutions will help them find answers step by step. They’ll find a personalized study plan available to them too! Introduce this extremely useful tool early in your course.
Discussion Questions: 1.
What is the Central Limit Theorem, and why is it important for hypothesis testing? The Central Limit Theorem states that when repeated random samples are taken from a population, the distribution of the sample means becomes approximately normal, with mean µ and standard deviation σ/√n, as the sample size (n) becomes larger, irrespective of the shape of the original population distribution. As more and more random samples are taken from the population, the distribution of sample means becomes more normally distributed and looks smoother. Distributions of sample means are used in all hypothesis tests about means, so understanding the Central Limit Theorem is necessary to understand hypothesis tests of means. The Central Limit Theorem is the foundation for all tests of means: it gives us a set of rules for determining the mean, variance, standard deviation, and shape of the distribution of sample means. Without the Central Limit Theorem, we would not be able to make inferences about populations that are not normal. The approach to statistical decision making will be formalized in subsequent chapters on hypothesis testing.
2.
Why can a discrete distribution like the binomial distribution be approximated with a continuous distribution such as the normal distribution? The normal distribution can be used to approximate the binomial distribution when certain conditions are met. The appropriateness of the normal approximation to the binomial distribution depends on the binomial parameters. In general, the larger the value of n and the closer p (the probability of success) is to 0.5, the better the approximation. One distribution can be used to approximate another when their characteristics are fairly similar: the two distributions must have the same mean, the same variance, and a similar shape. When approximating the binomial distribution by the normal distribution, we ensure that their means are equal by setting µ (the mean of the normal distribution) equal to np (the mean of the binomial distribution being approximated). We must also set the variances of the two distributions equal, which is accomplished by setting σ² = npq. As n becomes larger and larger, the shape of the binomial distribution becomes more and more like the normal distribution. Thus, if n is large, and the conditions np ≥ 10 and nq ≥ 10 are met, the binomial distribution can be approximated by the normal distribution. A correction factor can be applied to the approximation, because the binomial distribution is discrete and the normal distribution is continuous. This continuity correction factor is beyond the scope of this textbook.
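The quality of the approximation can be checked numerically. This sketch compares the exact binomial CDF with the normal approximation, including the continuity correction mentioned above (which the text itself leaves out of scope); the parameter values are arbitrary illustrative choices:

```python
from math import comb
from statistics import NormalDist

def binom_cdf(n, p, k):
    """Exact P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

n, p = 100, 0.5                            # np = nq = 50, both well above 10
mu, sigma = n * p, (n * p * (1 - p)) ** 0.5  # matching mean np and sd sqrt(npq)
exact = binom_cdf(n, p, 55)
approx = NormalDist(mu, sigma).cdf(55.5)   # continuity correction: 55 -> 55.5
print(f"exact binomial P(X <= 55) = {exact:.4f}")
print(f"normal approximation      = {approx:.4f}")
```

With n = 100 and p = 0.5, the two values agree to several decimal places, which is why the normal approximation is acceptable once np and nq are large enough.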
Chapter 7 Making Decisions with a Single Sample Learning Objectives: 1.
Set up appropriate null and alternative hypotheses, and make appropriate conclusions by comparing p-values with significance levels.
2.
Use formal hypothesis tests to make appropriate conclusions about population proportions, on the basis of a single sample.
3.
Use a formal hypothesis test to make appropriate conclusions about a population mean, on the basis of a single sample.
Chapter Outline: 7.1 Formal Hypothesis Testing 7.2 Deciding about a Population Proportion−−the z-test of p 7.3 Deciding about the Population Mean−−the t-test of µ Overview: Chapter 7 introduces the elements of formal hypothesis testing. The introductory part of this chapter further develops the concepts discussed in Chapter 6, and introduces the terminology required. The focus is on p-values, for which the groundwork was laid in Chapter 6. The p-value approach is preferred to the rejection region approach, for two reasons:
• p-values provide more information
• p-values are normally reported in statistical software.
Students can learn to estimate p-values from tables quite easily. This is discussed in some detail in the text, and the PowerPoint slides that accompany the text can also help. Once the students master estimation of p-values from the tables, there is no reason to use the rejection region approach to hypothesis testing. For consistency and focus, and to eliminate the confusion that sometimes arises when two different methods to arrive at a conclusion are discussed, the text focuses only on the p-value approach. However, every hypothesis test indicates a significance level, so the problems in the book easily lend themselves to the rejection region approach, if that is preferred for some reason. This chapter begins with decisions about population proportions. The advantage of this is that students can move directly into formal hypothesis testing with the normal distribution, with which they are already familiar. This chapter illustrates the first of several Excel templates that are used throughout the book. These templates automate, in Excel, the calculations that would be done manually. Students should be able to see a direct relationship between the calculations they do by hand and the template results, and it is a good idea to do some problems both ways, to reassure them about how the templates work.
Note that the templates require the user to explicitly check required conditions before proceeding. The templates also often require some initial work with standard Excel features. The goal was to make the Excel work completely transparent to the student.
7.1 Formal Hypothesis Testing In a formal hypothesis test there are always two hypotheses: the null hypothesis and the alternative hypothesis. Null Hypothesis (H0): What you are going to believe about the population unless the sample gives you strongly contradictory evidence.
Alternative Hypothesis (H1): What you are going to believe about the population when there is strong evidence against the null hypothesis
Important characteristics of the null and alternative hypotheses:
1. The hypotheses are statements about a population parameter (they will never be about sample statistics). If we have sample data, we can calculate the sample statistic, so no hypothesis about it is required.
2. The hypotheses match, in the sense that if the null hypothesis (H0) is µ = #, the alternative hypothesis (H1) will be some other statement about µ.
3. The null hypothesis (H0) always contains an equality. The alternative hypothesis (H1) never contains an equality (=, ≤, or ≥).
Alternative Hypothesis (H1) Forms:
1. H1: population parameter > some particular value
2. H1: population parameter < some particular value
3. H1: population parameter ≠ some particular value
The first two forms are referred to as one-tailed tests, because only sample results that fall in one tail of the sampling distribution provide evidence against the null hypothesis, in favour of the alternative hypothesis. The third form is described as a two-tailed test, because sample results that are significantly too low (in the left tail of the sampling distribution) or significantly too high (in the right tail of the sampling distribution) would both provide evidence against the null hypothesis.
Teaching Tip: Students need adequate time to understand the formal hypothesis testing procedure before beginning actual calculations. I give my students several general situations without bogging them down with numbers first, and ask them to state the null and alternative hypotheses and to discuss their feelings about statistical significance. Here is a sample situation: “Kelsey’s advertises that the mean waiting time for lunch to be served to customers is 15 minutes. You have visited Kelsey’s over the past six months, and noticed that the mean waiting time is closer to 12 minutes.” I try to use real-life examples that require a statistician to base a conclusion on statistical significance. Recently, the Canadian government pulled a batch of Swine Flu vaccine because more people than normal in the sample were having reactions. I ask students to form the null and alternative hypotheses in this case. I also use drug studies as another example of general situations. Students feel less intimidated when they have the opportunity to explore the language and concepts before beginning calculations.
Making a Decision: Rejecting or Failing to Reject the Null Hypothesis Usual Language: Where there is convincing evidence against the null hypothesis, we say we reject the null hypothesis, and there is sufficient evidence to support the alternative hypothesis. When the evidence is weak, we say that we fail to reject the null hypothesis, and there is insufficient evidence to support the alternative hypothesis. Remember, failing to reject the null hypothesis does not imply that it is true. One cannot say that the null hypothesis is true, or even that one accepts the null hypothesis. All we can say is that we do not have strong evidence against it. We can never be that definitive unless we have examined the entire population. Significance Level and Type I and Type II Errors
Type I Error: Error that arises when we mistakenly reject the null hypothesis when it is in fact true.
Type II Error: Error that arises when we mistakenly fail to reject the null hypothesis when it is in fact false.
Significance Level (α): The probability of a Type I error in a hypothesis test.
Once the null and alternative hypotheses are set up, the next step is to establish a significance level for the test. For example, the significance level can be alpha = α = 0.05, which means that the significance level is set at 5%. When conducting a hypothesis test, one can make either of two errors. Both of these potential errors arise because we are using sample data to make a conclusion about a population. The best we can do is to control the probability of these kinds of errors. We cannot eliminate them completely, as long as we are using sample data (and not population data) to make decisions. Both kinds of errors have costs and other consequences. The way to set the significance level is to think about the costs and consequences of both types of errors, and then set α to the maximum tolerable level, which keeps the probability of a Type II error as low as possible. (Type II, β, error calculations are beyond the scope of this text.)
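The interpretation of α as the probability of a Type I error can be checked by simulation: test a TRUE null hypothesis many times and count how often it is wrongly rejected. All parameter values below are arbitrary assumptions for illustration:

```python
import random
from statistics import NormalDist

def simulate_type_i_rate(mu=100, sigma=15, n=30, alpha=0.05, trials=2000, seed=7):
    """Fraction of samples that (wrongly) reject a true null H0: mu = 100,
    using a two-tailed z-test with sigma known."""
    rng = random.Random(seed)
    z_dist = NormalDist()
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]  # H0 really is true
        xbar = sum(sample) / n
        z = (xbar - mu) / (sigma / n ** 0.5)
        p_value = 2 * (1 - z_dist.cdf(abs(z)))
        rejections += p_value <= alpha
    return rejections / trials

rate = simulate_type_i_rate()
print(f"observed Type I error rate: {rate:.3f}")  # close to alpha = 0.05
```

The observed rejection rate hovers around 5%, matching the chosen α; no choice of α can push it to zero while we work from samples.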
p-value: In a hypothesis test, the probability of getting a sample result at least as extreme as the observed sample result; the probability calculation is based on the sampling distribution that would exist if the null hypothesis were true.
Using p-values in Hypothesis tests: We reject the null hypothesis whenever the p-value ≤ α. The lower the pvalue is, the stronger the evidence against H0.
Once you have sample data, you will use it and the sampling distribution to determine whether the sample result gives you convincing evidence against the null hypothesis. This requires doing a probability calculation, based on the sampling distribution. This probability calculation results in what is called the p-value of a hypothesis test, which is the probability of getting a sample result at least as extreme as the observed sample result. The probability is always calculated under the assumption that the null hypothesis is true.
The p-value probability calculation must match the alternative hypothesis used in the test. It is:
P(sample statistic ≥ observed sample result) when H1 contains “>”
P(sample statistic ≤ observed sample result) when H1 contains “<”
2 × P(tail area beyond the observed sample result) when H1 contains “≠”
A low p-value means that the sample result would be highly unlikely if the null hypothesis were true. This gives us strong evidence against the null hypothesis. If the p-value is high, then the sample result is not unusual, and we do not have enough evidence to reject the null hypothesis. We reject the null hypothesis whenever the p-value is less than or equal to α, the significance level of the test. The lower the p-value, the stronger the evidence against H0.
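The three matching rules for the p-value calculation can be sketched as one small function, here using the standard normal distribution for illustration:

```python
from statistics import NormalDist

def p_value(z, alternative):
    """p-value for a z test statistic, matching the form of H1."""
    dist = NormalDist()
    if alternative == ">":      # H1: parameter > value, right tail
        return 1 - dist.cdf(z)
    if alternative == "<":      # H1: parameter < value, left tail
        return dist.cdf(z)
    if alternative == "!=":     # H1: parameter != value, double the tail area
        return 2 * (1 - dist.cdf(abs(z)))
    raise ValueError("alternative must be '>', '<' or '!='")

z = 1.96
print(f"{p_value(z, '>'):.4f}")   # 0.0250
print(f"{p_value(z, '<'):.4f}")   # 0.9750
print(f"{p_value(z, '!='):.4f}")  # 0.0500
```

The same observed z = 1.96 yields three very different p-values, which is why the p-value calculation must always match the alternative hypothesis.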
Steps in a Formal Hypothesis Test:
1. Specify H0, the null hypothesis.
2. Specify H1, the alternative hypothesis.
3. Determine or identify α, the significance level.
4. Collect or identify the sample data.
5. Calculate the p-value of the sample result. If p-value ≤ α, reject H0 and conclude that there is sufficient evidence to decide in favour of H1. If p-value > α, fail to reject H0 and conclude that there is insufficient evidence to decide in favour of H1.
Teaching Tip: This simple saying can help students remember the fifth step in the formal hypothesis testing procedure: “If the p is high, the null will fly. If the p is low, the null must go!” At the 0.05 level of significance, it can be restated as “If the p is high (>.05) the null will fly -- if the p is low (<.05) the null must go!” I have also seen it stated as, “If p is low, the null must go; if p is high, the null doesn’t lie,” or “if p is low the null must go and the null is dull.” A “dull null” plays on the idea that the null hypothesis is the uninteresting, no-effect claim.
Use examples to demonstrate the determination of p-values and the steps in a formal hypothesis test.

[Figure: sampling distribution for a left-tailed test at the 0.05 level of significance.]
[Figure: sampling distribution for a right-tailed test at the 0.05 level of significance.]
[Figure: regions of rejection and non-rejection for a two-tailed test at the 0.05 level of significance.]
Make sure to spend time in class helping students identify whether a hypothesis test is one-tailed or two-tailed. 7.2 Deciding about a Population Proportion−−the z-test of p Hypothesis tests about the population proportion, p, are illustrated in Examples 7.2a (page 269) and 7.2b (page 271). Example 7.2a applies the formal hypothesis testing format to a chapter review exercise from Chapter 6. Example 7.2b illustrates how to handle coded data with Excel, and illustrates the use of an Excel template for making decisions about the population proportion with a single sample (see page 273). The first Guide to Decision Making for a hypothesis test is presented on page 274. These Guides should be very helpful in reminding students of the steps involved. Each guide indicates the type of problem and data, and what conditions should be checked. The guides also provide details about the sampling distribution and the test statistic.
Calculate the p-value of the sample result. Determine the z-value using:

z = (p̂ − p) / √(pq/n)

Then determine the p-value, taking care to note whether the test is one-tailed (the area in one tail beyond the z-value) or two-tailed (twice the tail area beyond the z-value).
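Putting the pieces together, a z-test of a population proportion might be sketched as follows; the 290-successes-in-500-trials scenario and the function name are hypothetical, for illustration only:

```python
from statistics import NormalDist

def z_test_proportion(p0, phat, n, alternative="!=", alpha=0.05):
    """Sketch of a z-test of a population proportion.
    H0: p = p0; H1 given by `alternative` ('>', '<' or '!=')."""
    se = (p0 * (1 - p0) / n) ** 0.5        # standard error under H0
    z = (phat - p0) / se
    dist = NormalDist()
    if alternative == ">":
        p_value = 1 - dist.cdf(z)
    elif alternative == "<":
        p_value = dist.cdf(z)
    else:
        p_value = 2 * (1 - dist.cdf(abs(z)))
    return z, p_value, p_value <= alpha

# Hypothetical: H0: p = 0.5 vs H1: p > 0.5, with 290 successes in 500 trials.
z, pv, reject = z_test_proportion(0.5, 290 / 500, 500, alternative=">")
print(f"z = {z:.2f}, p-value = {pv:.5f}, reject H0 = {reject}")  # z is about 3.58
```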
7.3 Deciding about the Population Mean−−the t-test of µ We generally do not know the population standard deviation σ, and have to estimate it with the sample standard deviation s. To account for the resulting extra variability in the sampling distribution, we must use the t-distribution for hypothesis tests about the population mean. The t-distribution is actually a family of distributions, distinguished by their degrees of freedom, which are equal to n – 1 for tests about the mean. The t-distributions are symmetric probability distributions that look similar to the normal distribution, but with more area in the tails. After students get a chance to develop their skills with formal hypothesis tests about the population proportion, they are introduced to hypothesis tests about population means. The t-distribution is introduced right away, so a student never does a formal hypothesis test about a mean with a normal distribution (which is never, strictly speaking, correct, because σ is never known). Students who learn to do hypothesis tests about µ with the normal distribution often have trouble losing the habit of using z-scores, and the approach used here should eliminate that difficulty. Estimation of p-values from the t-table is dealt with in some detail.
Standardized t-score:

t = (x̄ − µ) / (s/√n)
Examples 7.3a (page 277) and 7.3b (page 281) illustrate the use of Excel to conduct hypothesis tests about µ. Example 7.3b also illustrates the use of an Excel template for making decisions about the population mean with a single sample. It is also possible to estimate p-values from a t-table of critical values. Example 7.3c (page 285) illustrates estimation of p-values. Example 7.3d (page 287) shows a two-tailed hypothesis test about a population mean. A Guide to Decision Making for a hypothesis test about a population mean is on page 288. Use of the t-distribution is dependent on the population being normal. We check this by creating a histogram of the sample data and checking to see if it looks normal. The t-distribution is robust, that is,
it is not strongly affected by non-normality of the data. However, the t-distribution should not be relied on if there are outliers in the data, or strong skewness. The following characteristics of the t-distribution are based on the assumption that the population of interest is normal, or nearly normal:
• It is, like the z-distribution, a continuous distribution.
• It is, like the z-distribution, bell-shaped and symmetrical.
• There is not one t-distribution, but rather a “family” of t-distributions. All t-distributions have a mean of zero, but their standard deviations differ according to the sample size, n. There is a t-distribution for a sample size of 20, another for a sample size of 22, and so on. The standard deviation of a t-distribution with five observations is larger than that of a t-distribution with 20 observations.
• The t-distribution is flatter, and more spread out, than the standard normal distribution, because its standard deviation is larger.
[Figure: comparison of the z and t distributions, illustrating the values of each for a 95% level of confidence.]
You should not make important decisions about means with sample sizes of less than 15 or 20, unless you have very good reason to believe that the population is normally distributed. When sample sizes become as large as 40 or more, reliable decisions can be made, even if the data are somewhat skewed. But there is no magic about these numbers, and the general rule will always be that decisions will be more reliable with larger sample sizes. Some specific guidance about normality is provided (“How Normal is Normal Enough?”, on page 289). Teaching Tip: I ask my students, “If I take a survey and I find 100% of those sampled like Britney Spears, can I conclude everyone likes Britney Spears?” I wait for student reactions, and then I tell them that I only sampled two students. Students must use caution when dealing with small samples. Such ridiculous claims can be found in advertising where sample sizes are too small or not indicated.
Go to MyStatLab at www.mathxl.com. Encourage students to use MyStatLab. There are multiple practice tests and quizzes per chapter. Students can receive individualized feedback. There is a built-in study guide and calendar feature!
Discussion Questions: 1.
Why is the p-value important in hypothesis testing? Hypothesis testing forces us to determine whether a relationship is real or just due to chance. To make the determination, the researcher has to decide how confident they need to be in judging the relationship. This is called the researcher’s confidence level, and it is usually 95%. A confidence level of 95% corresponds to an alpha (α) risk of 0.05. Alpha corresponds to the risk of a Type I error in hypothesis testing, which is the risk of rejecting the null hypothesis when the null hypothesis is indeed true. For example, imagine researchers were studying whether a particular cancer drug on the market was safe. If the drug is widely available in hospitals, then Health Canada has deemed it safe, and the null hypothesis would be that the drug is safe. If some evidence surfaces that the drug causes complications in some patients, this supports the alternative hypothesis. If Health Canada removes the drug from the market when the drug was in fact safe and the complications were due to chance, then a Type I error has been committed: a safe drug was pulled from the market. The p-value measures the strength of the evidence against the null hypothesis. If the p-value is below the α risk of 0.05, the difference is statistically significant, and we can reject the null hypothesis with a 95% level of confidence. It should be noted, from a researcher’s viewpoint, that we can never accept or prove a null hypothesis. We can only fail to reject the null on the basis of the sample evidence. We can never be 100% certain of our conclusion unless we correctly sample every element of the population, and this is typically impossible. The p-value is important because it determines the outcome (conclusion or decision) of a hypothesis test. If the p is high, the null will fly. If the p is low, the null must go.
The lower the p-value, the more confident the researcher is in rejecting the null hypothesis and minimizing the chance of committing a Type I error.
2.
What is statistical significance? The government may claim that the yearly deficit has been reduced in their upcoming budget. On closer inspection, one might find that the deficit has been reduced by $100,000 in a $4.5 billion budget. Is this a statistically significant reduction in the deficit, or is the $100,000 reduction simply due to chance or normal fluctuations in the cost of items? Governments may claim that the poverty levels in their countries are reduced, but if the size of the population is dwindling as parents are having fewer children, is poverty really reduced? Statisticians rarely complete their research studies with definite “yes or no” conclusions, because there is always some element of chance or error when making predictions about a population based on a sample. No matter how large the sample, unless all members of a population are surveyed, there will always be a chance the researcher is wrong in their conclusion. Final interpretation is based on the level of statistical significance chosen for the situation, which is the probability of incorrectly rejecting a true null hypothesis. This is assessed with the p-value: the lower the p-value, the lower the chance of a Type I error, and the lower the Type I error risk, the higher the level of statistical significance. Thus a p-value of 0.001 indicates higher statistical significance than a p-value of 0.05, because there is a smaller probability of incorrectly rejecting a true null hypothesis.
Chapter 8 Estimating Population Values Learning Objectives: 1.
Estimate a population proportion.
2.
Estimate a population mean.
3.
Decide on the appropriate sample size to estimate a population mean or proportion, given a desired level of accuracy for the estimate.
4.
Use confidence levels to draw appropriate hypothesis-testing conclusions.
Chapter Outline: 8.1 Estimating the Population Proportion 8.2 Estimating the Population Mean 8.3 Selecting the Sample Size 8.4 Confidence Intervals and Hypothesis Tests
Overview: Chapter 8 covers estimation with confidence intervals. It then goes on to discuss the problem of deciding on sample size for estimation. The last section clearly establishes the link between confidence intervals and hypothesis testing. This chapter uses the image of trying to throw a horseshoe around a population parameter, in the hope of making it clear that it’s the horseshoe (confidence interval) that changes position from throw to throw (estimation to estimation), not the population parameter. This should help prevent the common error of referring to the estimated population parameter as if it had a probability distribution (e.g., “We have 90% confidence the mean is between 110 and 120”). Of course, the population parameter does not have a probability distribution, other than it is what it is, with 100% certainty.
Point estimate: A single-number estimate of the population parameter that is based on sample data.
Confidence interval estimate: A range of numbers of the form (a, b) that is thought to enclose the parameter of interest, with a given level of confidence.
This chapter starts with estimation of a population proportion, allowing students to become familiar with the concepts of estimation using the normal distribution (and the normal table). Estimation procedures rely on sampling, and we continue to assume simple random sampling. Estimation begins with a sample statistic that corresponds to the population parameter of interest. For example, it makes sense to estimate the population mean, µ, with the sample mean, x̄. Confidence intervals have the following general form:

(point estimate) ± (critical value) • (estimated standard error of the sample statistic)

Teaching Tip: Investigate the relationship between the width of the confidence interval and the level of confidence. The higher the level of confidence, the wider the confidence interval. It is harder to pinpoint the true population mean with a confidence level of 99% than with 90%. Explain this concept to students and show them several examples; without examples and investigation, the concept may seem counterintuitive.
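The width-versus-confidence relationship in the teaching tip above can also be demonstrated numerically. The following Python sketch is my addition (the text itself uses Excel); the point estimate of 115 and standard error of 3 are invented numbers for illustration:

```python
from statistics import NormalDist

def z_interval(xbar, se, confidence):
    """Confidence interval: point estimate ± (z critical value)(standard error)."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # two-tailed critical value
    return (xbar - z * se, xbar + z * se)

# Same point estimate (115) and standard error (3) at three confidence levels:
for conf in (0.90, 0.95, 0.99):
    lo, hi = z_interval(115, 3, conf)
    print(f"{conf:.0%} interval: ({lo:.2f}, {hi:.2f}), width {hi - lo:.2f}")
```

Running this shows the interval widening as confidence rises, which is exactly the horseshoe getting larger so that more throws enclose the stake.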
8.1 Estimating the Population Proportion

To estimate the population proportion, we turn to the sampling distribution of p̂, as long as the following conditions hold:
• The sample size must be < 5% of the population if sampling is done without replacement.
• np̂ and nq̂ must both be ≥ 10 for normality of the sampling distribution of p̂.

Confidence Interval Estimate for the Population Proportion:

p̂ ± zα/2 √(p̂q̂/n)
An Excel template for constructing a confidence interval for the population proportion is available on the CD that comes with the text. It is described on page 305.
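For instructors who want to show the calculation behind the Excel template, here is a short Python sketch of the standard large-sample interval p̂ ± z√(p̂q̂/n), including the normality checks from the text. Python and the sample numbers (120 successes in 400) are my additions, not from the text:

```python
from math import sqrt
from statistics import NormalDist

def proportion_ci(successes, n, confidence=0.95):
    """Large-sample confidence interval for a population proportion."""
    p_hat = successes / n
    q_hat = 1 - p_hat
    # Normality condition from the text: n·p̂ and n·q̂ must both be >= 10.
    if n * p_hat < 10 or n * q_hat < 10:
        raise ValueError("sampling distribution of p-hat may not be normal")
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    se = sqrt(p_hat * q_hat / n)
    return (p_hat - z * se, p_hat + z * se)

lo, hi = proportion_ci(120, 400, 0.95)  # p̂ = 0.30
```

The function deliberately refuses to compute an interval when the ≥ 10 condition fails, mirroring the check students should make by hand.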
8.2 Estimating the Population Mean

A confidence interval for the population mean has the following general form and is valid only if the population is approximately normally distributed:
Confidence Interval Estimate for the Population Mean:

x̄ ± tα/2 (s/√n), with n − 1 degrees of freedom
An Excel template for constructing a confidence interval for the population mean is available on the CD that comes with the text. It is described on page 311. The Guide to Technique on page 309 explains the above formula in more detail and the calculation of the degrees of freedom.
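A hand-calculation version of this interval can be sketched in Python. Note one assumption: the Python standard library has no t-distribution quantiles, so the t critical value must be looked up from a t table (2.262 is the two-tailed 95% value for 9 degrees of freedom). The waiting-time data are invented for illustration:

```python
from math import sqrt
from statistics import mean, stdev

def mean_ci(sample, t_crit):
    """Confidence interval for µ: x̄ ± t·(s/√n).
    t_crit comes from a t table with n - 1 degrees of freedom."""
    n = len(sample)
    xbar = mean(sample)
    se = stdev(sample) / sqrt(n)
    return (xbar - t_crit * se, xbar + t_crit * se)

waits = [4.1, 3.8, 5.2, 4.7, 4.0, 3.5, 4.9, 4.4, 4.2, 4.6]  # n = 10, df = 9
lo, hi = mean_ci(waits, t_crit=2.262)  # t for 95% confidence, df = 9
```

This makes visible why the degrees of freedom matter: the critical value is tied to n − 1, not to n.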
8.3 Selecting the Sample Size

Deciding on the sample size is a challenge, because you need information that you do not normally have until after you have taken your sample.

Setting the Half-Width of the Interval (the farthest that x̄ can be from µ):
Formula to Select the Appropriate Sample Size to Estimate the Mean:

n = (zα/2 · s / HW)²
The appropriate sample size is the next whole number above the calculated result. HW stands for the half-width of the interval. If, for example, you want to estimate the mean to within 10 units, the half-width would be 10. Some estimate for s, the standard deviation, must be available. If the range of the data is known, s can be estimated as range/4. If there is no information about s, a preliminary sample may have to be taken in order to estimate s. The z-score corresponds to the desired level of confidence for the estimate. The Guide to Technique for choosing the appropriate sample size to estimate µ is on page 314.

Formula to Select the Appropriate Sample Size to Estimate the Proportion:

n = p̂q̂ (zα/2 / HW)²
The Guide to Technique for choosing the sample size to estimate p is on page 316. HW stands for the half-width of the interval, as above. Some estimate for p̂ must be available. If there is no estimate, use p̂ = 0.50, as this will give the most conservative estimate of the sample size. The appropriate sample size is the next whole number above the calculated result. In this chapter, rather than refer to the half-width of the confidence interval as the “error” (which is not, strictly speaking, correct), it is referred to as just what it is—the half-width, with HW as the corresponding notation.
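Both sample-size rules above, including the round-up-never-down convention, can be sketched in a few lines of Python (my addition; the worked numbers are invented for illustration):

```python
from math import ceil

def n_for_mean(z, s, hw):
    """Sample size to estimate µ within ±hw: the next whole number
    at or above (z·s/HW)² — round up, never down."""
    return ceil((z * s / hw) ** 2)

def n_for_proportion(z, hw, p_hat=0.50):
    """Sample size to estimate p within ±hw; p̂ = 0.50 is the most
    conservative choice when no prior estimate is available."""
    return ceil(p_hat * (1 - p_hat) * (z / hw) ** 2)

n_for_mean(1.96, 25, 5)         # (1.96·25/5)² = 96.04, so n = 97
n_for_proportion(1.96, 0.03)    # 0.25·(1.96/0.03)² ≈ 1067.1, so n = 1068
```

`ceil` enforces the point made in the text: a formula result of 96.04 means 96 is not quite big enough, so 97 is required.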
8.4 Confidence Intervals and Hypothesis Tests There is a direct correspondence between a two-tailed test of hypothesis with a significance level of α and an interval estimate with a confidence level of (1 - α).
If the confidence interval contains the hypothesized value of the population parameter, there is insufficient evidence to reject the null hypothesis. If the confidence interval does not contain the hypothesized value of the population parameter, there is sufficient evidence to reject the null hypothesis.
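The decision rule above is mechanical enough to express directly. A minimal Python sketch (my addition; the intervals shown are made up for illustration):

```python
def two_tailed_decision(ci, hypothesized):
    """Two-tailed test at significance level alpha, carried out with a
    (1 - alpha) confidence interval: reject H0 only if the hypothesized
    value falls outside the interval."""
    lo, hi = ci
    if lo <= hypothesized <= hi:
        return "fail to reject H0"
    return "reject H0"

two_tailed_decision((110, 120), 115)  # hypothesized value inside
two_tailed_decision((110, 120), 125)  # hypothesized value outside
```

This pairs naturally with the horseshoe image: the hypothesized stake either sits inside the thrown horseshoe or it does not.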
The section on selecting the sample size to estimate the mean explicitly deals with the difficulties of estimating the standard deviation. An explanation of why the z-score is used in the sample size formula for the mean is provided (something that is too often glossed over, leaving students wondering why they use t-scores in the confidence interval, but z-scores when they are calculating sample size). The point is also made that the result of the sample size formula is not “rounded”, but set at the whole number next highest to the formula result. For example, if the sample size formula result is 69.1, sample size should be set at 70, not 69. The formula indicates that a sample size of 69 is not quite big enough, so a sample size of 70 is required.
Go to MyStatLab at www.mathxl.com. Encourage students to use MyStatLab. There are multiple practice tests and quizzes per chapter. Students can receive individualized feedback. There is a built-in study guide and calendar feature!
Discussion Questions: 1. Why do we have sample size formulas? We have sample size formulas to determine how many members of a population should be selected to ensure that the population is properly represented. Before collecting data, researchers need to determine how much data they require in order to meet a certain level of confidence (typically 95% or 98%). Determining an appropriate sample size is a very important issue, because samples that are too large may waste time, resources, and money, while samples that are too small may lead to inaccurate results. In most cases, we can easily determine the minimum sample size needed to estimate a process parameter, such as the population mean µ. If the sample size is too small, the researcher may not have a wide enough range of participants to see the desired results at the specified level of confidence, or the results may be dismissed as the product of chance. If the sample size is too large, the costs of the research may require excessive funding for the project. Thus, sample size formulas are essential for conducting research.
2. What is a confidence interval? A confidence interval is a special kind of interval estimate of a population parameter. Interval estimates are often preferred, because every sample will have a different sample mean (unless two samples are exactly identical). Instead of a single estimate for the mean, a confidence interval gives a lower and upper limit for the population mean.

I tell my students that you can never pin down the actual population mean without sampling every element of the population. If I wanted to know the average time all drivers wait in the Tim Horton’s drive-thru for their coffee, I would have to sample all drivers at every Tim Horton’s restaurant in the world. As a researcher, I may not be able to pinpoint the exact population parameter (the mean waiting time), but I can give a fairly accurate range of times that I can expect the population parameter to fall between. The confidence interval tells us how much uncertainty there is in our estimate of the true population mean.

Confidence limits are expressed in terms of a confidence level. Although the choice of confidence level is up to the researcher, in practice 90%, 95%, and 99% intervals are often used, with 95% being the most commonly used. The narrower the interval, the more precise the estimate of the true population mean. The higher the level of confidence, the wider the confidence interval. It is more difficult to pinpoint the true population mean with a confidence level of 99% than 90%. Explain this concept to students, and show them several examples. Use the image of trying to throw a horseshoe around a population parameter, in the hope of making it clear that it is the horseshoe (confidence interval) that changes position from throw to throw (estimation to estimation), not the population parameter. Confidence intervals are an important tool in inferential statistics.
Chapter 9 Making Decisions with Matched-Pairs Samples, Quantitative or Ranked Data

Learning Objectives:
1. To choose and conduct the appropriate hypothesis test to make conclusions about the mean difference in matched populations, on the basis of matched-pairs samples, for normally distributed quantitative data.
2. To choose and conduct the appropriate hypothesis test to make conclusions about the difference in matched populations, on the basis of matched-pairs samples, for non-normally distributed quantitative data.
3. To choose and conduct the appropriate hypothesis test to make conclusions about the difference in matched populations, on the basis of matched-pairs samples, for ranked data.
Chapter Outline:
9.1 Matched Pairs, Quantitative Data, Normally Distributed—The t-Test and Confidence Interval of µD
9.2 Matched Pairs, Quantitative Data, Non-Normal Differences—The Wilcoxon Signed Rank Sum Test
9.3 Matched Pairs, Ranked Data—The Sign Test

Overview: Chapter 9 covers comparisons of matched-pairs samples. It is somewhat unusual to deal with this case before the more general case of comparisons of independent samples is discussed. However, the benefits of this ordering of topics for students are clear. The test for matched pairs of quantitative data with normal differences is just an extension of the hypothesis test of the mean (covered in Chapter 7). The Sign Test for matched pairs of ranked data is just a variation of the hypothesis test of the proportion (also covered in Chapter 7). It is timely to reinforce the earlier learning, and requires only small incremental additional learning here. The only significantly new material in this chapter is the Wilcoxon Signed Rank Sum Test. Even this is not entirely new, because the idea of dealing with rank sums was introduced earlier in Chapter 3, when the Spearman Rank Correlation Coefficient was discussed. Some introductory courses do not cover the Wilcoxon Signed Rank Sum Test, but such an approach leaves the student with no technique in cases where differences are not normally distributed. The chapter provides some detailed guidance about when the t-test should be used, and when the Wilcoxon Signed Rank Sum Test should be used (see page 353). The Develop Your Skills questions for Sections 9.1 and 9.2 are the same, except for the data sets. This was done purposely, so that students will realize that it is not possible to decide between the t-test and the Wilcoxon Signed Rank Sum Test by simply reading the question. All the techniques in this chapter apply to matched-pairs data. There are two situations when sample data are matched pairs:
1. A matched-pairs experimental study 2. A matched-pairs observational study Matched-Pairs Experimental Study: There is a measurement or count, followed by an action of some kind, followed by a second measurement or count.
Matched-Pairs Observational Study: There is a matching or pairing of observations, designed so that it is easier to decide what caused any observed change between the observations.
Examples of both experimental and observational studies, such as those given on page 326 of the text, should be discussed with the students. In all cases with matched pairs, it is essential that the order of subtraction (or comparison) be consistent. As well, you will have to think a bit about what the order of subtraction tells you about the alternative hypothesis, so that you can do the correct p-value calculation (or estimation).
9.1 Matched Pairs, Quantitative Data, Normally Distributed—The t-Test and Confidence Interval of µD

When the quantitative matched-pairs data have normally distributed differences, a t-test of the mean difference is used to make decisions. The null hypothesis is always that there is no difference, on average, between the two measurements for the matched pairs. The subscript D is used to remind us that we are dealing with the differences between corresponding observations. The null hypothesis H0: µD = 0 (no significant difference) will be compared to one of the following three possible alternatives:
• µD > 0
• µD < 0
• µD ≠ 0
A decision about which of the three possible alternative hypotheses to use will depend on the context of the analysis, and the order of subtraction used to arrive at the differences. The t-score is calculated as follows, with the subscript D reminding us that we are looking at a data set of differences:

t = x̄D / (sD / √nD)
with nD – 1 degrees of freedom
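The paired t-score is straightforward to compute directly from the column of differences. A minimal Python sketch (my addition; the before/after measurements are invented for illustration, and the text itself uses Excel’s t-Test: Paired Two Sample for Means tool):

```python
from math import sqrt
from statistics import mean, stdev

def paired_t(before, after):
    """t-score for matched pairs: differences d = after - before,
    t = mean(d) / (s_D / sqrt(n_D)), with n_D - 1 degrees of freedom."""
    d = [a - b for b, a in zip(before, after)]
    n = len(d)
    t = mean(d) / (stdev(d) / sqrt(n))
    return t, n - 1

before = [12.0, 15.5, 11.2, 13.8, 14.1, 12.9]
after  = [13.1, 15.9, 12.0, 14.5, 14.0, 13.6]
t, df = paired_t(before, after)
```

Note the consistent order of subtraction (after minus before), which is what determines the direction of the alternative hypothesis.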
If you have raw or summary data, you can use the template illustrated on page 330 to make your calculations in Excel (as example 9.1b on page 332 illustrates). With raw sample data, you can also use the Excel data analysis tool “t-Test: Paired Two Sample for Means” (see page 332). Its use is demonstrated on page 332, with data from example 9.1a. Following directly from the sampling distribution and the discussion of the hypothesis test, a confidence interval estimate can also be constructed for the mean difference, as follows: (point estimate) ± (critical value) • (estimated standard error of the sample statistic) In other words:

Confidence Interval of µD:

x̄D ± tα/2 (sD/√nD), with nD − 1 degrees of freedom
A Guide for Decision Making for matched pairs, quantitative data, normal differences is shown on page 335.
9.2 Matched Pairs, Quantitative Data, Non-Normal Differences—The Wilcoxon Signed Rank Sum Test

When the differences for quantitative data are not normally distributed (a particular concern with small sample sizes), the Wilcoxon Signed Rank Sum Test is used to make decisions. The requirement is that the sample histograms be similar in shape and spread (or, equivalently, that the histogram of differences be symmetric). The null hypothesis (H0) is that there is no difference in the population locations. The absolute values of the differences are ranked from smallest to largest. When differences are tied, the associated ranks are averaged. The sums of the ranks for the positive and negative differences are then calculated. Differences of zero are ignored. The procedure for assigning ranks is described on page 339. When nW, the number of non-zero differences, is at least 25, an approximately normal sampling distribution can be used, with a z-score of:

z = (W − µW)/σW, where µW = nW(nW + 1)/4 and σW = √(nW(nW + 1)(2nW + 1)/24)
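The ranking procedure (drop zero differences, rank absolute values, average tied ranks, sum by sign) is exactly what the add-in automates, and it can be sketched directly. This Python version and its tiny example data are my additions; the z-score uses the standard normal approximation with µW = n(n+1)/4 and σW = √(n(n+1)(2n+1)/24):

```python
from math import sqrt

def signed_rank_sums(before, after):
    """W+ and W-: rank |d| from smallest to largest (averaging tied
    ranks), ignore zero differences, then sum the ranks by sign of d."""
    d = [a - b for b, a in zip(before, after) if a != b]
    order = sorted(range(len(d)), key=lambda i: abs(d[i]))
    ranks = [0.0] * len(d)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and abs(d[order[j + 1]]) == abs(d[order[i]]):
            j += 1                      # extend over a block of tied |d|
        avg = (i + j) / 2 + 1           # average rank for the tied block
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    w_plus = sum(r for r, diff in zip(ranks, d) if diff > 0)
    w_minus = sum(r for r, diff in zip(ranks, d) if diff < 0)
    return w_plus, w_minus

def wsrst_z(w, n):
    """Normal-approximation z-score; appropriate only when n >= 25."""
    mu = n * (n + 1) / 4
    sigma = sqrt(n * (n + 1) * (2 * n + 1) / 24)
    return (w - mu) / sigma
```

Working through a three-pair example by hand and comparing against `signed_rank_sums` is a good way to convince students the averaging of tied ranks is being done correctly.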
Some of the add-ins to Excel are introduced in this chapter. The add-in for the Wilcoxon Signed Rank Sum Test calculates the rank sums and sample size for a data set. The student must then go on to use these results to complete the hypothesis test, with the option of doing this manually (also an option for tests or exams) or with an Excel template. This two-step process should allow the student to see that the Excel version of this test is directly parallel to the manual calculations. The add-in for the Wilcoxon Signed Rank Sum Test does not calculate a z-score (or the associated p-value), because this is not appropriate when the sample size is small. The approach used here will prevent the student from thoughtlessly relying on computer output to complete the hypothesis test. The Guide to Decision Making is quite clear about when the normal approximation of the sampling distribution is appropriate, and when the test should be done using tables, something that students often have trouble remembering. The Excel add-ins that come with the text (Non-Parametric Tools) contain a tool called Wilcoxon Signed Rank Sum Test Calculations that will calculate W+ and W− for the Wilcoxon Signed Rank Sum Test. You can then use the worksheet template titled "Making Decisions About Matched Pairs, Quantitative Data, Non-Normal Differences (WSRST)" for p-value calculations when the sample size is ≥ 25 (see page 347). When nW < 25, the table on page 581 in Appendix 4 should be used to estimate the p-value, as example 9.2b on page 350 illustrates. A Guide to Decision Making for matched pairs, quantitative data, non-normal differences is shown on page 352.

Teaching Tip: Work through example 9.2a on page 344 carefully with students. Do not skim over the assignment of ranks. Students often find these examples challenging because they require them to assign both the positive and negative rankings.
On the whiteboard or overhead, use colours or use two charts to show that the positive ranks match the positive differences. I always have my students sort the data in Excel using a custom sort to separate the data. This is very confusing otherwise. Many students will just try to match the third column for the positive rankings and negative rankings, and this is not correct because we have sorted the rankings. Work through this example step-by-step with your students, and encourage them to use colours if they are completing this question by hand.
9.3 Matched Pairs, Ranked Data—The Sign Test When the data are ranked, the corresponding sample data points cannot be subtracted. However, we can keep track of whether differences are positive or negative. If there is no difference in rankings of the matched pairs, on average, then the number of positive differences should be about equal to the number of negative differences (again, for this test, differences of zero are ignored). The material on the Sign Test is covered both for small samples (when using the binomial distribution directly is appropriate) and larger samples (when the normal approximation to the binomial is appropriate). An Excel add-in is provided to calculate the number of positive and negative differences in matched pairs of ranked data. Again, a two-step process is required, with an Excel template or manual calculations needed to complete the hypothesis test.
The binomial probability distribution with p = 0.50 can be used to calculate the p-value of the sample result, as example 9.3a on page 355 illustrates. There is no built-in data analysis function in Excel to conduct a Sign Test. The Excel add-ins that come with the text (Non-Parametric Tools) contain a tool called Sign Test Calculations that will calculate the numbers of positive and negative differences for a data set. You can then use the worksheet template titled "Making Decisions About Matched Pairs, Ranked Data (Sign Test)" for p-value calculations (see page 359). If you have to do this test by hand, and the sample size is large, you can make a decision using a hypothesis test of p = 0.50, as example 9.3b on page 360 illustrates. A Guide to Decision Making for matched pairs, ranked data, is shown on page 361.
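The exact binomial calculation behind the Sign Test is short enough to demonstrate in class. A Python sketch (my addition; the counts of 8 positive and 2 negative differences are made up for illustration):

```python
from math import comb

def sign_test_p(n_plus, n_minus, alternative="two-sided"):
    """Exact Sign Test p-value using the binomial with p = 0.50;
    zero differences are dropped before counting n_plus and n_minus."""
    n = n_plus + n_minus

    def upper_tail(k):
        # P(X >= k) for X ~ Binomial(n, 0.5)
        return sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n

    p_one = upper_tail(max(n_plus, n_minus))
    if alternative == "two-sided":
        return min(1.0, 2 * p_one)
    return p_one

sign_test_p(8, 2)  # 10 non-zero differences, 8 of them positive
```

With 8 positive differences out of 10, the one-sided tail is P(X ≥ 8) = (45 + 10 + 1)/1024, matching what students get from a binomial table.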
Go to MyStatLab at www.mathxl.com. Encourage students to use MyStatLab. Students can practise the exercises indicated with red as often as they want, and guided solutions will help them find answers step by step. They’ll find a personalized study plan available to them too! Introduce this extremely useful tool early in your course.
Discussion Questions: 1. Why do we have nonparametric versions of statistical tests that deal with ranked data? Imagine we wanted to compare the number of no-shows at the Toronto Pearson Airport to the Kingston Airport on particular days of the week. We would expect the number of no-shows to be greater at the Toronto airport, because there are more flights and a larger population in the city. For this reason, we could rank the number of no-shows on each day of the week and compare the rankings using an appropriate matched-pairs test such as the Wilcoxon Signed Rank Sum Test. Nonparametric tests (like the Wilcoxon Signed Rank Sum Test) are often used in place of their parametric counterparts (the paired t-test of means) when certain assumptions about the underlying population cannot be verified. For example, when comparing two paired samples, the Wilcoxon Signed Rank Sum Test does not assume that the differences between the paired observations are normally distributed, whereas its parametric counterpart, the paired t-test, does. Nonparametric tests can be more powerful in detecting population differences when those assumptions are not satisfied. All tests involving ranked data (i.e., data that can be put in order) are called nonparametric.

2. How do you choose which matched-pairs tests to use for your data? You should choose a parametric test if you are sure that your data are sampled from a normal population (at least approximately). You can check this by using Excel to create histograms of your data. You should select a nonparametric test in cases where the outcome is a rank or a score and/or the population is clearly not normal. The following chart can be used as a guide to choosing the appropriate test (the chart is not intended to be exhaustive):
Researcher's Plan | Measurement (from a Normal Population) | Rank or Score (from a Non-Normal Population)
Describe one group | Mean, standard deviation | Median, interquartile range
Compare one group to a hypothetical value | One-sample t-test | Nothing quite comparable
Compare two unpaired groups | Unpaired t-test | Mann-Whitney test (not discussed in this text)
Compare two paired groups | Paired t-test | The Wilcoxon Signed Rank Sum Test
Compare two groups with ranked data | Nothing quite comparable | The Sign Test (matched pairs, ranked data)
Quantify association between two variables | Pearson correlation | Spearman correlation
Chapter 10 Making Decisions with Two Independent Samples, Quantitative or Ranked Data

Learning Objectives:
1. To choose and conduct the appropriate hypothesis test to compare two populations, based on independent samples, for normal quantitative data.
2. To choose and conduct the appropriate hypothesis test to compare two populations, based on independent samples, for non-normal quantitative or ranked data.
Chapter Outline:
10.1 Independent Samples, Normal Quantitative Data—The t-Test and Confidence Interval Estimate of µ1 − µ2
10.2 Independent Samples, Non-Normal Quantitative Data or Ranked Data—The Wilcoxon Rank Sum Test

Overview: Chapter 10 covers comparison of two independent samples, with quantitative or ranked data. This chapter presents hypothesis tests to make decisions when comparing two independent samples of normal quantitative or ranked data. When samples are independent, there is no relationship between observations. Conclusions about differences in populations based on independent samples are not as strong as those based on matched pairs (which are covered in Chapter 9). Despite the weaker conclusions, independent samples are often used because matched-pairs data are more costly or even impossible to obtain. The t-test for independent samples is presented first, for normal quantitative data.
10.1 Independent Samples, Normal Quantitative Data—The t-Test and Confidence Interval Estimate of µ1 − µ2

If the data are quantitative, and the histograms of the sample data appear normal, the hypothesis test is a t-test, assuming unequal variances, with H0: µ1 − µ2 = 0. The t-test focuses on the unequal-variances case as the default. The reasons for this are as follows:
• The statistical tests for equal variances are sensitive to non-normality, so it is difficult to decide if variances are equal, particularly when sample sizes are small.
• Any test of variances should really be independent of the test of means (using the same data set to test both means and variances increases the chance of a Type I error).
• If variances are mistakenly assumed to be equal when they are not, the results will be unreliable, particularly when sample sizes differ and the smaller sample has the larger variance.
Using the unequal-variances version of the t-test will generally lead to the right decision, even if the variances are in fact equal. The equal-variances version of the t-test is also described in the chapter, for cases when there is good independent evidence of equal variances. The challenge of focusing on the unequal-variances version of the t-test is that the degrees of freedom calculation is a bit onerous for manual calculations. A solution to this difficulty is presented here. Approximating the degrees of freedom with the minimum of (n1 − 1) and (n2 − 1) leads to fairly reliable results (although the p-value will tend to be overestimated). As demonstrated in the chapter, the result of the approximation can be quite close to the actual calculations. This approximation for the degrees of freedom is also used for the confidence interval estimate of µ1 − µ2. The sampling distribution is approximately a t-distribution. The degrees of freedom are:

Degrees of Freedom (df):

df = (s1²/n1 + s2²/n2)² / [ (s1²/n1)²/(n1 − 1) + (s2²/n2)²/(n2 − 1) ]
The t-score used in calculating p-values is:

t = (x̄1 − x̄2) / √(s1²/n1 + s2²/n2)
The Excel Data Analysis tool for raw sample data, t-Test: Two-Sample Assuming Unequal Variances, is illustrated in example 10.1a on page 374. These problems can also be done by hand, using the minimum of (n1 − 1) and (n2 − 1) as the degrees of freedom, as illustrated in example 10.1b on page 377. There is an Excel template for problems with summary sample data; it is illustrated on page 378. Teaching Tip: If these problems are done by hand, use the minimum of (n1 − 1) and (n2 − 1) for the degrees of freedom as an approximation to the calculation above. This is illustrated in example 10.1b on page 377.
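To show students how close the min(n1 − 1, n2 − 1) shortcut comes to the full unequal-variances calculation, both can be computed side by side. A Python sketch (my addition; the two small samples are invented for illustration):

```python
from math import sqrt
from statistics import mean, variance

def welch_t(sample1, sample2):
    """Unequal-variances t-score plus both degrees-of-freedom choices:
    the full calculation and the quick min(n1-1, n2-1) approximation."""
    n1, n2 = len(sample1), len(sample2)
    v1, v2 = variance(sample1), variance(sample2)
    se = sqrt(v1 / n1 + v2 / n2)
    t = (mean(sample1) - mean(sample2)) / se
    df_full = (v1 / n1 + v2 / n2) ** 2 / (
        (v1 / n1) ** 2 / (n1 - 1) + (v2 / n2) ** 2 / (n2 - 1))
    df_approx = min(n1 - 1, n2 - 1)
    return t, df_full, df_approx

t, df_full, df_approx = welch_t([5, 7, 9, 6, 8], [4, 6, 5, 7])
```

Here the full calculation gives roughly 7 degrees of freedom against the approximation's 3, which illustrates why the shortcut is conservative (p-values tend to be overestimated).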
Confidence Interval Estimate for the Difference in Means:

(x̄1 − x̄2) ± tα/2 √(s1²/n1 + s2²/n2)
where tα/2 is the t-score corresponding to the desired level of confidence, with approximate degrees of freedom (df) as calculated above. Excel templates for hypothesis tests and confidence interval estimates for µ1 − µ2 are introduced in this chapter. Note that while the hypothesis tests described in the chapter are restricted to those with H0: µ1 − µ2 = 0, the Excel template can also handle cases where the difference in means is hypothesized to be non-zero. An Excel template example is illustrated on page 382. The Guide to Decision Making for independent samples, normal quantitative data, is on page 380.
10.2 Independent Samples, Non-Normal Quantitative Data or Ranked Data—The Wilcoxon Rank Sum Test

If the data are quantitative, sample sizes are small, and histograms are non-normal, you should use the Wilcoxon Rank Sum Test. This requires that the histograms be similar in shape and spread (if not, the test may indicate only that the population distributions are different in some way: location, shape, or spread, which may not be all that helpful). The null hypothesis is that the population locations are the same. If the data are ranked, then the Wilcoxon Rank Sum Test is also appropriate, with the same requirements. The Wilcoxon Rank Sum Test (WRST) is presented for cases where quantitative data are non-normal, or when independent samples of ranked data are compared. At this point, it should be fairly easy for students to deal with the rank sums, as they have already worked with them in the previous chapter. When the sample sizes are at least 10 and the sampling distribution is approximately normal, the z-score used to calculate the p-values is:

z = (W1 − µW1)/σW1, where µW1 = n1(n1 + n2 + 1)/2 and σW1 = √(n1n2(n1 + n2 + 1)/12)
An Excel add-in which provides rank-sum calculations is provided. It will allow you to calculate W1 and W2 from a data set. As well, there is an Excel template to allow you to calculate the p-values. These are both illustrated in example 10.2a on page 388. An Excel template for the associated hypothesis test also exists. Advice is provided about when to use the t-test and when to use the WRST (see page 389).
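The pooling-and-ranking step that the add-in performs can be sketched directly, which helps students see that W1 and W2 are just rank sums over the combined samples. This Python version and the z-score's normal approximation (µW1 = n1(n1 + n2 + 1)/2, σW1 = √(n1n2(n1 + n2 + 1)/12)) are my additions:

```python
from math import sqrt

def rank_sums(sample1, sample2):
    """Pool both samples, rank from smallest to largest (averaging
    tied ranks), and return the rank sums W1 and W2."""
    pooled = [(v, 0) for v in sample1] + [(v, 1) for v in sample2]
    pooled.sort(key=lambda pair: pair[0])
    sums = [0.0, 0.0]
    i = 0
    while i < len(pooled):
        j = i
        while j + 1 < len(pooled) and pooled[j + 1][0] == pooled[i][0]:
            j += 1                      # extend over a block of tied values
        avg = (i + j) / 2 + 1           # average rank for the tied block
        for k in range(i, j + 1):
            sums[pooled[k][1]] += avg
        i = j + 1
    return sums[0], sums[1]

def wrst_z(w1, n1, n2):
    """Normal-approximation z; appropriate when both samples have >= 10."""
    mu = n1 * (n1 + n2 + 1) / 2
    sigma = sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    return (w1 - mu) / sigma
```

A quick sanity check on a four-value example by hand confirms the function before students trust it on a full data set.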
The Guide to Decision Making for independent samples, non-normal quantitative or ranked data, is on page 391. Teaching Tip: Focus on meeting the underlying conditions necessary when choosing an appropriate statistical test. Make use of the Excel templates given in this chapter.
Go to MyStatLab at www.mathxl.com. Encourage students to use MyStatLab. Students can practise the exercises indicated with red as often as they want, and guided solutions will help them find answers step by step. They’ll find a personalized study plan available to them too!
Discussion Questions: 1. Why is it important that any test of variances should be independent of the test of means (i.e., when determining whether to use a t-test assuming equal variances or a t-test assuming unequal variances)? Remember, for a hypothesis test, the significance level is the cut-off for deciding whether a sample result is so unusual that we have convincing evidence against the null hypothesis. However, there is always the chance that the null hypothesis is true, and we have actually observed a highly unusual sample. If we reject the null hypothesis in such a case, we are committing a Type I error. The significance level is the upper limit on the Type I error. But it is only the upper limit if we conduct just one test. If we conduct another test with the same data, we are increasing the possibility of a Type I error. Skuce describes it this way, “Suppose you are skating across a frozen lake, and there is a 5% chance that you will fall through the ice (an unusual event). If you skate across the lake only once, then it is unlikely that you will fall through the ice. But if you keep skating back and forth across the lake, your chances of falling in (the unusual event) will increase.” Doing repeated hypothesis tests on the same data set is like skating across the lake more than once. There is an increased chance of actually observing an unusual sample, and rejecting the null hypothesis when we should not. So, we cannot safely use the same data set to test for equality of variances and also equality of means. Testing the variances thus requires additional sampling, before you sample to make a decision about the means. It is recommended that you always use the unequal-variances version of the t-test to compare population means with independent normally distributed samples, unless you have strong independent evidence that the variances are the same.

2. How do you decide whether to use a Wilcoxon Rank Sum Test (WRST) or a t-test of µ1 − µ2?
Skuce identifies the temptation for students to incorrectly use the WRST whether the data are normal or not, since normal data fit the requirements for the WRST (i.e., that the distributions be similar in shape and spread). Students should not succumb to this temptation. The t-test is preferred to the Wilcoxon Rank Sum Test because the t-test is more powerful when the data are normal; that is, it is better at detecting false null hypotheses. Researchers should always use the t-test if they can. The determination of how normal is normal enough depends on sample size. The t-test works well, particularly when sample sizes are equal and sample histograms are similar, even if they are somewhat non-normal and skewed. The larger the sample sizes, the more reliable the t-test will be. As always, one should be cautious about using a t-test when there are outliers in the data. Choosing an appropriate statistical test requires careful planning, care, and thought to ensure the underlying conditions are met.
Chapter 11 Making Decisions with Three or More Samples, Quantitative Data—Analysis of Variance (ANOVA)

Learning Objectives:
1. To check the required conditions for the analysis of variance (ANOVA).
2. To choose and conduct the appropriate hypothesis test to compare the means of three or more populations, based on independent samples, for normally distributed quantitative data.
3. To make appropriate multiple comparisons to decide which means differ when a test of hypothesis indicates that at least one of the means of three or more populations is different from the others.
Chapter Outline:
11.1 Checking Conditions for One-Way Analysis of Variance (ANOVA)
11.2 The Hypothesis Test For Independent Samples, Normal Quantitative Data—One-Way Analysis of Variance (ANOVA)
11.3 Making Multiple Comparisons To Decide Which Means Differ—The Tukey-Kramer Procedure
11.4 A Brief Introduction to Two-Factor ANOVA

Overview: In Chapter 11, Skuce describes hypothesis tests for comparing the means of three or more populations of quantitative data. She uses a conceptual approach to performing the calculations, rather than the computational shortcut formulas. Making decisions with three or more sample means is one of the most powerful techniques in statistics. There are many situations in which multiple comparisons of this type are made. Suppose, for example, that a college wants to compare the annual salaries of graduates of the four different streams of its Business diploma program, five years after graduation. The college could randomly select graduates from each of the streams (Marketing, Accounting, Human Resources, and General Business), and with appropriate reassurances about maintaining confidentiality, collect the data. Analysis of variance (ANOVA) would be the statistical technique used to determine if a statistically significant difference existed between the mean annual salaries of graduates of the four streams.

In the introduction to the chapter, Skuce lays the foundation for the ANOVA test. She emphasizes why repeated t-tests would lead to an increased chance of a Type I error, and therefore why ANOVA must be used. She introduces the main examples in the chapter, and defines terminology essential to understanding the ANOVA process. ANOVA is the basis of more advanced comparison techniques with multiple factors and levels. For example, using the example above, we could further break down the different streams by gender (adding another factor to the analysis in addition to the program stream).
Factor: An explanatory characteristic that might distinguish one group or population from another.
Level: A value or setting that the factor can take.
Response variable: A quantitative variable that is being measured or observed. We are interested in how this variable responds to the different levels of the factor.
Teaching Tip: I tell my students that ANOVA is the basis for more advanced statistical techniques. I tell my students about ANCOVA, which is analysis of variance with a covariate. Imagine you were comparing average grades of three classes of math students, and you knew students in one class had better math backgrounds entering the course. This knowledge could be used as a covariate. I also define, in plain English, the terms MANOVA and MANCOVA, which are multivariate analysis of variance and multivariate analysis of covariance, respectively. These tests handle the analysis of multiple response variables and factors.
11.1 Checking Conditions for One-Way Analysis of Variance (ANOVA) As with other chapters in the textbook, Skuce emphasizes that there are required conditions that must be met before conducting an ANOVA test. Teaching Tip: I always tell students that it is easy to perform the calculations behind statistical tests. I could tally the number of males and the number of females in the class. I could then average these two numbers. Is there such a thing as an average gender? No, even with simple tests, students must ensure that the required conditions are met.
Software programs like Microsoft Excel make performing the ANOVA test simple, but the following conditions must be met first:
1. The data points are independent and randomly selected from each population.
2. Each population is normally distributed.
3. The populations all have the same variability. In particular, the variances are equal.
It is important to select a random sample of independent observations. The normality of the populations must be established by examining histograms of the sample data. As in past chapters, we can create histograms in Excel and visually examine them to establish normality. Example 11.1 on page 405 checks the conditions for a one-factor ANOVA.
11.2 The Hypothesis Test For Independent Samples, Normal Quantitative Data—One-Way Analysis of Variance (ANOVA)

Section 11.2 lays the conceptual framework for performing the ANOVA test, namely the hypothesis test:
H0: µ1 = µ2 = ··· = µk
H1: At least one µ differs from the others.
We proceed by comparing the between-sample variation (associated with different levels of the factor) to the within-sample variation (random variation). If the population means are actually equal, then the between-sample variation will be about equal to the within-sample variation. If the means are not equal, then the between-sample variation will be significantly greater than the within-sample variation. Skuce walks students through an example calculation, which helps them conceptually understand the ANOVA test. As mentioned in the introduction to the teacher's guide, Skuce uses conceptual formulas, rather than computational formulas, to aid student understanding of the procedure. The first calculation is the within-sample variation, which is the measure of random variability in the data.

Calculation of SSwithin: In general, the calculation of SSwithin for k samples is as follows:
SSwithin = SS1 + SS2 + SS3 + ··· + SSk
Next, students need to calculate the sum of squares for the between-sample variation. They first have to calculate an overall mean for the entire data set. In simple terms, this is calculated by averaging all of the data points from all of the samples at once.

Overall Mean: the overall mean, x, is the sum of all nT observations in the data set, divided by nT.
The overall mean is our best estimate of the true common mean of the populations, assuming they are all equal. We focus on the deviations of the individual sample means from this overall mean, which tell us how much the sample means differ from it. As usual, deviations are squared.

Calculation of SSbetween: In general, the calculation of SSbetween for k samples is as follows:

SSbetween = n1(x1 − x)² + n2(x2 − x)² + ··· + nk(xk − x)²

where x denotes the overall mean of the data set, and xi is the mean of sample i. By this point in the Skuce textbook, students should conceptually recognize the nature of this calculation. Before proceeding with the explanation, Skuce notes the relationship between SSwithin, SSbetween, and the total sum of squares, which we will call SStotal. The relationship is: SSwithin + SSbetween = SStotal. It is also true that SStotal is the sum of the squared deviations of every individual observation from the overall mean.
Once these calculations are complete, we then compute the mean square within groups. MSwithin is the estimate of random (within-sample) variation used in the analysis of variance, and it is the denominator of the F statistic.
Calculation of MSwithin: The mean square for within-sample variation is calculated as:

MSwithin = SSwithin / (nT − k)

where nT is the total number of observations in the entire data set, and k is the number of samples (and also the number of levels of the factor). As usual, nT = n1 + n2 + ··· + nk.
Next, we compute the mean square between groups. MSbetween is the estimate of between-sample variation used in the analysis of variance, and it is the numerator of the F statistic. Calculation of MSbetween: The mean square for between-sample variation is calculated as:

MSbetween = SSbetween / (k − 1)
The usual approach is to focus on the ratio of the two mean squares. The F distribution is used to assess the ratio of MSbetween to MSwithin. The calculation of the degrees of freedom is shown below.

The Sampling Distribution of MSbetween/MSwithin with Two or More Independent Samples:

The sampling distribution of MSbetween/MSwithin follows the F distribution, with (k – 1), (nT – k) degrees of freedom, where nT is the total number of observations in the data set, and k is the number of samples (and levels of the factor). The notation for the F distribution is F(k – 1), (nT – k).
Students will benefit from examining the F distributions on page 413 to aid in understanding why the answer depends on the degrees of freedom (and the sample sizes and number of levels of each factor).
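For instructors who want to show the conceptual calculation outside Excel, the sums of squares, mean squares, and F statistic described above can be sketched in a few lines of Python. The three samples below are hypothetical, not from the text:

```python
# Conceptual one-way ANOVA calculation, mirroring the SSwithin/SSbetween
# approach described above. Sample data are hypothetical.
from scipy import stats

samples = [
    [23, 25, 28, 30],   # level 1 of the factor (assumed data)
    [31, 33, 35, 37],   # level 2
    [22, 24, 26, 28],   # level 3
]

k = len(samples)                                   # number of samples (levels)
n_T = sum(len(s) for s in samples)                 # total number of observations
overall_mean = sum(x for s in samples for x in s) / n_T

# SSwithin = SS1 + SS2 + ... + SSk (random variation within each sample)
ss_within = sum(sum((x - sum(s) / len(s)) ** 2 for x in s) for s in samples)

# SSbetween = sum of n_i * (sample mean - overall mean)^2
ss_between = sum(len(s) * (sum(s) / len(s) - overall_mean) ** 2 for s in samples)

ms_within = ss_within / (n_T - k)        # denominator of F
ms_between = ss_between / (k - 1)        # numerator of F
F = ms_between / ms_within
p_value = stats.f.sf(F, k - 1, n_T - k)  # P(F >= calculated F)

# scipy's built-in one-way ANOVA should agree with the manual calculation
F_check, p_check = stats.f_oneway(*samples)
```

Letting students compare the manual F and p-value against scipy's built-in `f_oneway` is a quick way to check hand calculations.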
Students will be relieved to know that we do not normally do all of the arithmetic by hand. Excel has a built-in Data Analysis tool that supports one-way ANOVA and can be used with the sample data. For single-factor ANOVA, Excel's Anova: Single Factor function in the Data Analysis ToolPak is described on page 414. Example 11.2 on page 416 walks students through a one-factor ANOVA using Excel. In case students are doing the calculations manually, they can find instructions on how to use an F-table of critical values on page 417. This is not the preferred approach: because there are so many possible F distributions, it is impossible to give a comprehensive table of critical values. The Guide to Decision Making for a one-way ANOVA is given on page 419.
Guide to Decision Making: Three or More Independent Samples, Normal Quantitative Data—One-Way ANOVA to Decide About the Equality of Population Means

When:
• there are normal quantitative data, k independent samples.
• trying to make a decision about whether k population means differ, on the basis of the sample means.
• trying to determine whether different levels of a factor are causing variation in a response variable.
• variances are equal.

Steps:
1. Specify H0, the null hypothesis, which will be H0: µ1 = µ2 = ··· = µk.
2. Specify H1, the alternative hypothesis, which will be H1: at least one µ differs from the others.
3. Determine or identify α, the significance level.
4. Collect or identify the sample data. Identify or calculate:
• the total number of observations, nT, and the number of observations in each sample, ni
• the number of populations (levels of the factor), k
• the sample means and sample variances, and the overall mean of the data set
5. Check for normality of populations with histograms of the sample data. ANOVA can be used with somewhat skewed data, particularly when sample sizes are large. It is best if sample sizes are equal, or nearly equal.
6. Check for equality of variances. If the ratio of the largest sample variance to the smallest is less than 4, proceed. Be cautious if sample sizes differ, particularly if the smallest sample has the largest variance.
7. If the samples appear to be normally distributed, with approximately equal variances, calculate the appropriate F statistic, F = MSbetween/MSwithin.
8. Use the F distribution with (k – 1), (nT – k) degrees of freedom to calculate (or approximate, if using tables) the appropriate p-value for the hypothesis test. In all cases, the p-value will be of the form P(F ≥ calculated F). If p-value ≤ α, reject H0 and conclude that there is sufficient evidence to decide in favour of H1. If p-value > α, fail to reject H0 and conclude that there is insufficient evidence to decide in favour of H1.
9. State your conclusions in language appropriate to the problem.
11.3 Making Multiple Comparisons To Decide Which Means Differ—The Tukey-Kramer Procedure

The one-way ANOVA test described in Section 11.2 is the first step in comparing many population means. If the result of the ANOVA test is failure to reject the null hypothesis, then we simply conclude that there is not enough evidence to suggest the population means differ. But if we reject the null hypothesis, we have evidence to suggest that at least one of the population means differs from the others, and we must do some further analysis to find out which of the means differs from the others. This further analysis comes in the form of the Tukey-Kramer procedure, which helps us decide which of the means differ from each other. Section 11.3 discusses the construction of confidence intervals for the differences in the population means. For example, we can construct 95% confidence intervals so that, 95% of the time, all of the intervals will simultaneously contain the true differences in the means being compared.
The formula for the Tukey-Kramer confidence interval for the difference in the mean of population i and the mean of population j (µi – µj) is as follows:

(xi – xj) ± q · sqrt( (MSwithin/2)(1/ni + 1/nj) )

where xi and xj are the sample means, and q is the critical value of the studentized range distribution for k samples and (nT – k) degrees of freedom. When the samples are the same size (ni = nj = n), this formula simplifies to:

(xi – xj) ± q · sqrt( MSwithin/n )
If the confidence interval estimate does not include zero, then it appears that the two means being compared are different. If the confidence interval estimate includes zero, then it does not appear that the two means being compared are different. Example 11.3 on page 424 uses the Tukey-Kramer approach to find out which means differ. These calculations can be done with the help of the worksheet template called “Tukey-Kramer CI,” available in the workbook called “Templates”. Explanations of using the Excel template for finding the Tukey-Kramer confidence interval are given on page 426.
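For instructors who want to show the Tukey-Kramer interval outside the Excel template, the calculation can be sketched with scipy's studentized range distribution (available in scipy 1.7 or later). The summary statistics below are assumed for illustration, not taken from the text:

```python
# Sketch of a Tukey-Kramer confidence interval for mu_i - mu_j.
# All summary statistics below are hypothetical.
import math
from scipy.stats import studentized_range

k, n_T = 3, 12                   # number of samples and total observations (assumed)
ms_within = 5.2                  # MSwithin from the ANOVA table (assumed)
mean_i, n_i = 30.0, 4            # sample i mean and size (assumed)
mean_j, n_j = 25.0, 4            # sample j mean and size (assumed)

# q critical value for a family-wise 95% confidence level,
# with k samples and (n_T - k) degrees of freedom
q_crit = studentized_range.ppf(0.95, k, n_T - k)

half_width = q_crit * math.sqrt((ms_within / 2) * (1 / n_i + 1 / n_j))
lower = (mean_i - mean_j) - half_width
upper = (mean_i - mean_j) + half_width
# If the interval excludes zero, the two means appear to differ.
```

This mirrors the template's logic: one q critical value is used for all pairwise intervals, which is what keeps the family-wise confidence level at 95%.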
11.4 A Brief Introduction to Two-Factor ANOVA

The two-way analysis of variance is an extension of the one-way analysis of variance. There are two independent variables (hence the name two-way). The two independent variables in a two-way ANOVA are called factors. The premise is that there are two factors, each of which may affect the dependent variable. Each factor will have two or more levels within it, and the degrees of freedom for each factor is one less than the number of levels. The treatment groups are formed by taking all possible combinations of the two factors. For example, if the first factor has 3 levels, and the second factor has 4 levels, then there will be 3 × 4 = 12 different treatment groups. Skuce provides a brief introduction to two-factor ANOVA, but gives no calculations or formulas. She gives some examples of situations where a two-factor ANOVA or more advanced statistical techniques are needed.
Go to MyStatLab at www.mathxl.com. Encourage students to use MyStatLab. Students can practise the exercises indicated with red as often as they want, and guided solutions will help them find answers step by step. They’ll find a personalized study plan available to them too! ANOVA can be a difficult topic to comprehend. MyStatLab can help!
Discussion Questions:
1. Why do we need ANOVA? In statistics, we often want to know if the means of two populations are equal. For example, do men and women earn equal wages on average for performing the same job? This can be tested using a two-sample t-test for the equality of means. The problem with that test is that it cannot handle more than two populations. What if we want to know whether people from western Canada, eastern Canada, and the Maritimes earn the same wages on average? To determine this, we need to use an analysis of variance (ANOVA). Variance is a measure of dispersion, not central tendency (like the mean, median, and mode). We must use ANOVA to analyze the variance in order to test whether the means of three or more groups are equal. Sample means can differ for two reasons. The first reason is random sampling error. Multiple sample means will never be exactly equal, even if the groups really do have the same population means. There is always natural variation. If the sample means differ only because of sampling error, we would expect those sample means to be very similar. If they are not very similar, then we would most likely conclude that the population means really are different. Therefore, the variance in the sample means provides a method of testing whether the sample means are similar enough or not. If the variance between the groups is small relative to the variation within the groups, then we do not have evidence that the population means differ. If the variance between the groups is large relative to the variation within the groups, we will conclude the means are not all equal.
2. Why do we use the Tukey-Kramer procedure? The Tukey-Kramer procedure is an example of a post-hoc test, used after the analysis of variance (ANOVA). Post-hoc tests are designed for situations in which the researcher has already obtained a statistically significant difference in the means of the three or more groups, and additional exploration of the differences among means is required to provide specific information on which means are significantly different from each other. The ANOVA test only tells the researcher that there is a difference among the population means of the three or more groups. Post-hoc tests are required to identify which means differ from one another.
Chapter 12 Making Decisions with Two or More Samples, Qualitative Data

Learning Objectives:
1. To choose and conduct the appropriate hypothesis test to compare two population proportions.
2. To choose and conduct the appropriate hypothesis test to compare proportions in one population with a desired distribution.
3. To choose and conduct the appropriate hypothesis test to compare proportions across many populations, and draw a conclusion about the independence of population characteristics.
Chapter Outline:
12.1 Comparing Two Proportions—z-test and confidence interval of p1 – p2
12.2 χ2 Goodness-Of-Fit Tests
12.3 Comparing Many Population Proportions or Testing Independence—χ2 Test of a Contingency Table

Overview: Chapter 12 discusses making decisions with two or more samples of qualitative data. The chapter starts with a discussion of comparisons of two population proportions, and then introduces chi-squared analysis. The techniques in this chapter apply when you have two or more samples of qualitative data. In some cases, the focus is on proportions, and in others, the focus is on counts.

12.1 Comparing Two Proportions—z-test and confidence interval of p1 – p2

The hypothesis test for comparing two proportions often assumes equal proportions:
H0: p1 – p2 = 0
In such a case, the sample data are pooled to get an estimate of the common population proportion:
p̂ = (x1 + x2)/(n1 + n2)
where x1 and x2 are the numbers of successes observed in the two samples.
The sample sizes must be large enough that no continuity correction is required, and the sampling distribution of the difference in sample proportions is approximately normal:
• n1 and n2 are fairly large.
• As sampling is done without replacement, each sample should be < 5% of its population.
Assuming the null hypothesis and conditions are met, a z-score is determined as follows:

z = (p̂1 – p̂2) / sqrt( p̂(1 – p̂)(1/n1 + 1/n2) )

where p̂ is the pooled sample proportion. This is the test statistic for the usual special case where the hypothesized difference under H0 is 0.
Excel's Data Analysis Histogram tool can be used to organize coded data to get frequencies, as in example 12.1a, which is illustrated on page 441. There is an Excel template for making decisions about two population proportions (see page 443). It is also possible to make a decision about whether two population proportions differ by some fixed amount, as example 12.1b on page 443 illustrates. The Guide to Decision Making for comparing two proportions is on page 445 of the text. A confidence interval estimate for the difference between the two proportions, p1 – p2, can be constructed using the following formula, which is valid only if the normality conditions are met:

Confidence interval estimate for p1 – p2:
(p̂1 – p̂2) ± z · sqrt( p̂1(1 – p̂1)/n1 + p̂2(1 – p̂2)/n2 )
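The pooled z-test and the confidence interval for p1 – p2 can also be sketched in a few lines of Python. The counts below are hypothetical, and 1.96 is the usual 95% z critical value:

```python
# Two-proportion pooled z-test and confidence interval.
# Counts are hypothetical illustration data.
import math
from scipy.stats import norm

x1, n1 = 60, 200                 # successes / sample size, sample 1 (assumed)
x2, n2 = 45, 200                 # successes / sample size, sample 2 (assumed)
p1_hat, p2_hat = x1 / n1, x2 / n2

# Pooled estimate of the common proportion, used under H0: p1 - p2 = 0
p_pool = (x1 + x2) / (n1 + n2)

z = (p1_hat - p2_hat) / math.sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
p_value = 2 * norm.sf(abs(z))    # two-tailed p-value

# The confidence interval uses the unpooled standard error
se = math.sqrt(p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2)
lower = (p1_hat - p2_hat) - 1.96 * se
upper = (p1_hat - p2_hat) + 1.96 * se
```

Note the design point this makes visible to students: the test pools the proportions (because H0 assumes they are equal), while the confidence interval does not.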
An Excel template is available for finding a confidence interval estimate for the difference in the two population proportions. Refer to page 447 of the text.

12.2 χ2 Goodness-Of-Fit Tests

It is possible to test whether sample data conform to a hypothesized distribution across a number of categories. The technique is the chi-squared goodness-of-fit test, which is based on a comparison of observed sample frequencies to expected frequencies.

The Test Statistic:
χ2 = Σ (observed – expected)² / expected, summed over all categories
The null hypothesis is always that the population is as claimed or desired, and the alternative hypothesis is that the population is not as claimed or desired. The p-value is always calculated as P (χ2 ≥ the calculated sample test statistic). The related chi-squared distribution has k – 1 degrees of freedom (k is the number of categories in the data).
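A goodness-of-fit calculation of this kind can be sketched with scipy.stats.chisquare, which accepts observed and expected frequencies directly. The counts and claimed proportions below are hypothetical:

```python
# Chi-squared goodness-of-fit test on hypothetical survey counts.
from scipy.stats import chisquare

observed = [48, 35, 15, 2]                 # observed frequencies (assumed)
n = sum(observed)
claimed = [0.50, 0.30, 0.15, 0.05]         # proportions claimed under H0 (assumed)
expected = [p * n for p in claimed]        # expected frequencies

# Note: the smallest expected frequency here is 0.05 * 100 = 5,
# just meeting the "at least 5" requirement discussed in the text.
stat, p_value = chisquare(observed, f_exp=expected)
# The related chi-squared distribution has k - 1 = 3 degrees of freedom.
```

If an expected frequency had fallen below 5, the right move (per the text) would be to combine categories before running the test, not to proceed anyway.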
[Figure: chi-square probability distribution with 5 degrees of freedom, showing the rejection region for a 5% level of significance]

[Figure: examples of chi-square distributions for various degrees of freedom]

A requirement for using the chi-squared sampling distribution is that all expected frequencies should be at least 5. If they are not, you must combine categories in some logical way before proceeding. See example 12.2b on page 453 for an illustration. For goodness-of-fit tests, Excel's CHITEST function is described. For the more general case of comparing many population proportions or testing independence, an Excel add-in is provided. Unlike the other add-ins with this book, which calculate p-values, this add-in also alerts the user to the number of expected values less than five. This should prevent the student from drawing an erroneous conclusion.
Calculation of expected values for a goodness-of-fit test is fairly straightforward in Excel, using formulas. The CHITEST function of Excel reports the p-value of the chi-squared hypothesis test, based on observed and expected frequencies. The use of the CHITEST function is described on page 456. The correspondence between the z-test of proportions and the chi-squared test of a contingency table is described, with a specific example (with calculations) to illustrate. The Guide to Decision Making for a goodness-of-fit test is on page 457 of the text.

12.3 Comparing Many Population Proportions or Testing Independence—χ2 Test of a Contingency Table

The chi-squared distribution can be used to compare multiple populations across multiple categories (see example 12.3a on page 460). It can also be used to test the independence of the categories or characteristics of the sample data (see example 12.3b on page 465).

The test statistic:
χ2 = Σ (observed – expected)² / expected, summed over every cell of the table

where expected values are calculated as follows:
expected value for a cell = (row total × column total) / grand total

As with the goodness-of-fit test, the χ2 test statistic should be used only if all expected frequencies are at least 5. If they are not, categories should be combined. The degrees of freedom for the chi-squared distribution are (r – 1)(c – 1), where r is the number of rows in the contingency table, and c is the number of columns (not including totals). An Excel add-in (see page 462) is available to do the calculations of expected values, the χ2 statistic, and the p-value, as illustrated in example 12.3b on page 465. The Guide to Decision Making for contingency table tests is on page 466.

Teaching Tip: Make sure to work through Example 12.3a on page 460 to show the computation of expected values. Students need actual practice with this calculation. Do not just show it to them on an overhead. It is much clearer if students actually work through the example. Ask them to create Exhibit 12.22 on page 461.
Go to MyStatLab at www.mathxl.com. Encourage students to use MyStatLab. Students can practise the exercises indicated with red as often as they want, and guided solutions will help them find answers step by step. They’ll find a personalized study plan available to them too!

Discussion Questions:
1. Why is ANOVA inappropriate for decisions with two or more samples of qualitative data? ANOVA is inappropriate for making decisions with two or more samples of qualitative data because we are not dealing with means and variances with qualitative data. The chi-square statistic is a nonparametric statistical technique used to determine whether a distribution of observed frequencies differs from the theoretical expected frequencies. Chi-square statistics use nominal (categorical)-level or ordinal-level data; thus, instead of using means and variances, this test uses frequencies. We cannot average categories; for example, it does not make sense to find an average race or average gender. Therefore, techniques like ANOVA, which compare means, are inappropriate for qualitative data.
2. What are the two types of chi-square tests? The two types of chi-square test are:
• The chi-square test for goodness-of-fit, which compares the expected and observed values to determine how well the researcher's predictions fit the data.
• The chi-square test for independence, which compares two sets of categories to determine whether the two groups are distributed differently among the categories.
Both of these tests are nonparametric statistical techniques which look at frequencies rather than means.
Chapter 13 Analyzing Linear Relationships, Two Quantitative Variables

Learning Objectives:
1. To create a scatter diagram and estimate the least-squares regression line with a sample of x-y quantitative data.
2. To check the conditions required for the use of the regression model in hypothesis testing and prediction.
3. To conduct the hypothesis test to determine if there is evidence of a significant linear relationship between x and y.
4. To produce (with Excel) and interpret the coefficient of determination for the regression relationship.
5. To use the regression relationship, if appropriate, to make predictions about an individual y-value and an average y-value, given a particular x-value.
Chapter Outline:
13.1 Creating a Graph and Determining the Relationship—Simple Linear Regression
13.2 Assessing the Model
13.3 Hypothesis Test about the Regression Relationship
13.4 How Good is the Regression?
13.5 Making Predictions

Overview: Chapter 13 provides a very thorough introduction to the analysis of the relationship between two quantitative variables. As in other chapters, an emphasis is placed on checking required conditions before proceeding. Specific instructions and illustrations of how to check conditions are provided. One example (the Woodbon data) is used to illustrate the material throughout the chapter. A number of other data sets are used repeatedly, in the Develop Your Skills (DYS) questions, so that students get the experience of doing a full analysis on each data set. Excel is used to produce the graphs and to do the calculations. There is really no reason to ask students to do these calculations manually, as they are tedious and time-consuming. This material can be tested with Excel printouts, with a focus on understanding and interpreting the output. Throughout the chapter, the necessity to use good judgment to interpret results, and to proceed thoughtfully, is emphasized. Students can become mesmerized by all the output
produced in regression analysis, and they will need some guidance to keep their focus on whether the results are sensible. The Excel templates for prediction and confidence intervals explicitly remind students to check the required conditions. The templates are designed to work with the output of Excel’s Regression tool.
13.1 Creating a Graph and Determining the Relationship—Simple Linear Regression

The first step in analyzing a relationship between two variables is to create a scatter diagram, with the dependent (response) variable plotted on the y-axis and the independent (explanatory) variable plotted on the x-axis. The equation of the straight line that best fits the points on the scatter diagram is of the form:

ŷ = b0 + b1x

The coefficients b0 and b1 result from minimizing the sum of the squared residuals for the line. A residual is the difference between the observed value of y for a given x, and the predicted value of y for that x. The coefficients of the least-squares line can be determined with Excel, either with the trend line tool (see page 483) or with the Regression tool of Data Analysis (see page 485).
13.2 Assessing the Model

Theoretically, there is a normal distribution of possible y-values for every x. The population relationship we are interested in is the average y for every x, as follows:

µy = β0 + β1x

We cannot reliably make predictions with the regression equation, or conduct a hypothesis test to see if there is a significant relationship between the x-variable and y-variable, unless certain conditions are met (these are summarized in the box on page 491). We check these conditions by focusing on the residuals in the sample data set. It is easy to check the residuals with Excel. Exhibit 13.16 on page 492 describes the process. See the box on page 491 for checking the requirements for the linear regression model.
13.3 Hypothesis Test about the Regression Relationship

Once we have assured ourselves that the required conditions are met, we can test to see if there is a significant linear relationship between the x-variable and y-variable. This is done with a test of the slope of the line, β1. The null hypothesis of no relationship (β1 = 0) is tested against one of three possible alternatives:
• β1 ≠ 0 (there is some relationship between x and y)
• β1 > 0 (there is a positive relationship between x and y)
• β1 < 0 (there is a negative relationship between x and y)
The output of Excel's Data Analysis tool called Regression provides the t-score and p-value for the two-tailed version of this test. An illustration of how to read the output is on page 508. A Guide to Decision Making for testing the slope of the regression line for evidence of a linear relationship is shown on page 509.
13.4 How Good is the Regression?

The Pearson r, the correlation coefficient, can be used to measure the degree of linear association between the x-variable and y-variable, as discussed in Chapter 3. Another related measure is the coefficient of determination, or R2. The coefficient of determination measures the percentage of variation in the y-variable that is explained by changes in the x-variable. The coefficient of determination is produced in Excel's regression output. Instructions are noted in the text. It is important to recognize that "explained by" is not the same as "caused by". Even though the R2 may be high, the true causal relationship can only be judged on the basis of an understanding of the context of the data.

Teaching Tip: Make sure students understand the difference between the correlation coefficient r and the coefficient of determination R2. I find students often mix up the two, and incorrectly interpret one as the other. Adding a Trendline to a scatterplot in Excel displays the coefficient of determination, R2, not the r value. Students need to take time and care to make sure they are interpreting the right quantity.
13.5 Making Predictions

Two types of predictions can be made, if the requirements are met:
• A prediction interval predicts a particular value of y for a given value of x.
• A confidence interval predicts the average y for a given value of x.
Formula for a prediction interval for y, given a particular x, is:

ŷ ± t · se · sqrt( 1 + 1/n + (x – x̄)² / Σ(xi – x̄)² )

where se is the standard error of estimate, and the t-distribution has n – 2 degrees of freedom.
Formula for a confidence interval for µy, given a particular x, is:

ŷ ± t · se · sqrt( 1/n + (x – x̄)² / Σ(xi – x̄)² )

where the t-distribution again has n – 2 degrees of freedom.
There is no built-in data analysis function in Excel to calculate these interval estimates. Therefore, Pearson has created an Excel add-in that is available on the CD that comes with the text, called “Multiple Regression Tools” to do the calculations for you. Within this tool, there is an option for “Prediction and Confidence Interval Calculations.” See the Excel instructions in the text on page 514. Always remember that it is not legitimate to make predictions outside the range of the sample data. Go to MyStatLab at www.mathxl.com. MyStatLab is an online homework, tutorial, and personalized assessment system that accompanies this Pearson Education textbook. Chapter 13 introduces new formulas and calculations. Encourage students to begin using MyStatLab on a regular basis to assist in their homework.
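For instructors curious how the add-in's interval calculations work, the prediction and confidence intervals can be sketched by hand in Python. This is an illustrative sketch, not Pearson's add-in, and the data are hypothetical:

```python
# Hand calculation of the prediction and confidence intervals
# for a given x, using hypothetical sample data.
import math
from scipy import stats

x = [1, 2, 3, 4, 5, 6]                   # explanatory variable (assumed data)
y = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2]     # response variable (assumed data)
n = len(x)
fit = stats.linregress(x, y)

x_bar = sum(x) / n
ss_x = sum((xi - x_bar) ** 2 for xi in x)

# Standard error of estimate: sqrt(SSE / (n - 2))
sse = sum((yi - (fit.intercept + fit.slope * xi)) ** 2 for xi, yi in zip(x, y))
s_e = math.sqrt(sse / (n - 2))

x_new = 3.5                               # note: inside the range of the sample data
y_hat = fit.intercept + fit.slope * x_new
t_crit = stats.t.ppf(0.975, n - 2)        # 95% intervals, n - 2 degrees of freedom

# Prediction interval for an individual y (note the extra "1 +" term)
pi_half = t_crit * s_e * math.sqrt(1 + 1 / n + (x_new - x_bar) ** 2 / ss_x)
# Confidence interval for the average y at this x
ci_half = t_crit * s_e * math.sqrt(1 / n + (x_new - x_bar) ** 2 / ss_x)
```

The sketch makes two teaching points concrete: the prediction interval is always wider than the confidence interval (the extra "1 +" term), and x_new is deliberately chosen inside the range of the sample data, since extrapolation beyond it is not legitimate.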
Discussion Questions:
1. What do you need to remember when making predictions with regression models? You should remember that it is not legitimate to make predictions with a regression model for x-values outside the range of the sample data. Even a powerful regression relationship with a high coefficient of determination and low standard error should not be relied on outside the range of the sample data; the error that arises when the regression relationship is used for an x-value outside this range can be quite large.

2. Why is residual analysis important? There are requirements for the residuals when performing predictions or hypothesis tests about the linear regression relationship. These are:
1. For any given value of x, there are many possible values of y, and therefore many possible values of the residual, or error term. The distribution of the є-values for any given x must be normal. This means that the actual y-values will be scattered in a normal fashion around the regression line.
2. The normal distributions of є must have a mean of 0. This means that the actual y-values will have an expected value, or mean, that is equal to the predicted y from the regression line.
3. The standard deviation of є, which we will refer to as σє, is the same for every value of x. This means that the actual y-values will be scattered about the same distance away from the regression line, all along the line.
4. The ε-values for different data points are not related to each other (another way to say this is that the ε-values for different data points are independent).
Without ensuring these requirements are met, the linear regression analysis may be invalid. The residual plot should be checked to see if the residuals have the same amount of variation for all values of x (the third requirement above says that the residuals all have the same standard deviation). Look for a residual plot showing a horizontal band, centred vertically on zero. This is a judgment call for the researcher. Another requirement (#4) of the data is that the error terms, or residuals, are independent of one another. It can be difficult to check the independence of the error terms, since this involves imagining all of the ways in which they could be related. One of the most common sources of non-independence among the residuals is time. When you are working with time-series data, you should plot the residuals against time to see if any pattern emerges. Thus, residual analysis is a critical component of regression analysis.
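The plot-based checks above can also be spot-checked numerically. The Python sketch below (the function name and the simulated residuals are our own, purely illustrative) computes rough numeric versions of the checks; it supplements, rather than replaces, looking at the plots.

```python
import numpy as np

def residual_checks(x, residuals):
    """Rough numeric versions of the usual residual-plot checks (a sketch only)."""
    x = np.asarray(x, dtype=float)
    r = np.asarray(residuals, dtype=float)
    r = r[np.argsort(x)]        # order residuals by x (or by time, for time-series data)
    half = len(r) // 2
    return {
        # Requirement 2: residuals should centre on zero.
        "mean": r.mean(),
        # Requirement 3: similar spread at low and high x-values.
        "sd_low_x": r[:half].std(ddof=1),
        "sd_high_x": r[half:].std(ddof=1),
        # Requirement 4: lag-1 correlation near zero suggests independence over time.
        "lag1_corr": np.corrcoef(r[:-1], r[1:])[0, 1],
    }

# Simulated residuals that satisfy the requirements:
rng = np.random.default_rng(1)
x = np.linspace(0, 10, 50)
checks = residual_checks(x, rng.normal(0.0, 1.0, size=50))
```

If `sd_low_x` and `sd_high_x` differ greatly, or `lag1_corr` is far from zero, the residual plots deserve a very close look.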
Chapter 14 Analyzing Linear Relationships, Two or More Variables

Learning Objectives:
1. To estimate the linear relationship between a quantitative response variable and one or more explanatory variables.
2. To check the conditions required for use of the regression model in hypothesis testing and prediction.
3. To assess the regression relationship, using appropriate hypothesis tests and the coefficient of determination.
4. To make predictions using the regression relationship.
5. To understand the considerations involved in choosing the "best" regression model, and the challenges presented by multicollinearity.
6. To use indicator variables to model qualitative explanatory variables.
Chapter Outline: 14.1 Determining The Relationship—Multiple Linear Regression 14.2 Checking The Required Conditions 14.3 How Good Is The Regression? 14.4 Making Predictions 14.5 Selecting The Appropriate Explanatory Variables 14.6 Using Indicator Variables In Multiple Regression 14.7 More Advanced Modelling
Overview: The general purpose of multiple regression is to analyze the relationship between several independent or predictor variables and a dependent or criterion variable. For example, a real estate agent might record for each listing: the size of the house (in square feet), the number of bedrooms, the average income of families in the neighbourhood, the proximity to schools, whether renovations are required, etc. Once this information has been obtained for various houses, it would be interesting to see whether and how these variables relate to the selling price of the house. We might learn that the number of bedrooms is a better predictor of the selling price for a house in a particular neighbourhood than whether renovations are required. Researchers will likely examine a number of possible explanatory variables, with the aim of developing a model that is economical (that is, has reasonable data requirements) and works well (that is, makes useful predictions). This chapter walks us through the process.
14.1 Determining The Relationship—Multiple Linear Regression

Section 14.1 builds on the discussion in Chapter 13, extending the mathematical model to include more than one explanatory variable. As in prior chapters, students use Excel to create graphs to examine the relationships between the response variable and the explanatory variables. Excel is also used to determine the relationship between the response variable and the explanatory (predictor) variables. Analysis begins by including all of the new explanatory variables to create a multiple regression model. The Data Analysis add-in has a "Regression" tool to analyze the model. Instructions can be found on page 529.
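For reference, the coefficient estimates that Excel's Regression tool reports are ordinary least-squares estimates, which can be reproduced in a few lines of Python. The housing data below are invented for illustration; they are not from the text.

```python
import numpy as np

# Hypothetical data: selling price ($000s) explained by size (hundreds of
# square feet) and number of bedrooms.
size     = np.array([12.0, 15.0, 9.0, 20.0, 16.0, 11.0, 18.0, 14.0])
bedrooms = np.array([2.0, 3.0, 2.0, 4.0, 3.0, 2.0, 4.0, 3.0])
price    = np.array([210.0, 260.0, 180.0, 330.0, 275.0, 200.0, 310.0, 250.0])

# Design matrix with a column of 1s for the intercept b0.
X = np.column_stack([np.ones(len(price)), size, bedrooms])

# Least-squares estimates of b0, b1, b2.
coeffs, *_ = np.linalg.lstsq(X, price, rcond=None)
b0, b1, b2 = coeffs
predicted = X @ coeffs
```

A useful sanity check: with an intercept term, the residuals of a least-squares fit always sum to zero.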
14.2 Checking The Required Conditions

This section extends the theoretical model from the last chapter to include more explanatory variables, revisiting the discussion about least-squares models. As before, Excel is used to check the required conditions for the regression model by examining the residuals. The population relationship we are trying to model is:

y = β0 + β1x1 + β2x2 + … + βkxk + ε
Skuce develops this formula from the simple linear regression model, and emphasizes that the residuals must be examined and the conditions must be met before accepting any regression model.

Requirements for Predictions or Hypothesis Tests About the Multiple Regression Relationship:

1. For any specific combination of the x-values, there are many possible values of y and the residual (or "error term") ε. The distribution of these ε-values must be normal for any specific combination of x-values. This means that the actual y-values will be normally distributed around the predicted y-values from the regression relationship, for every specific combination of x-values.

2. These normal distributions of ε-values must have a mean of zero. The actual y-values will have expected values, or means, that are equal to the predicted y-values from the regression relationship.

3. The standard deviation of the ε-values, which we refer to as σε, is the same for every combination of x-values. The normal distributions of actual y-values around the predicted y-values from the regression relationship will have the same variability for every specific combination of x-values.

4. The ε-values for different combinations of the x-values are not related to each other. The value of the error term ε is statistically independent of any other value of ε.
In the Excel Regression dialogue box, you should tick Residuals, Standardized Residuals, and Residual Plots. As in Chapter 13, students should create a histogram of the residuals, and should plot the residuals against time if they have time-series data. The rest of this section discusses independence of error terms, the normality of residuals, and outliers and influential observations. Skuce also discusses what can be done if the required conditions are not met. Example 14.2 on page 537 walks through checking the conditions listed above for linear multiple regression.
14.3 How Good Is The Regression?

This section introduces hypothesis tests about the significance of the overall model, and the individual explanatory variables. It also discusses the measure of the strength of the relationship between the explanatory variables and the response variable, and the adjusted coefficient of determination (adjusted R²). In multiple regression, we test the model as a whole with the following hypothesis test:

H0: β1 = β2 = … = βk = 0
H1: at least one of the βi ≠ 0

The F-test statistic is:

F = MSR/MSE = (SSR/k) / (SSE/(n − (k + 1)))

Remind students that they have seen this formula before.
The Sampling Distribution of F in Linear Multiple Regression Models:
The sampling distribution of MSR/MSE follows the F distribution, with (k, n − (k + 1)) degrees of freedom, where n is the number of observed data points and k is the number of explanatory variables in the model.
Example 14.3A on page 543 conducts a hypothesis test of the significance of the regression model. Fortunately, the Excel output not only calculates the F statistic for the hypothesis test of the regression model, it also calculates the associated p-value. If the hypothesis test of the overall regression model indicates a significant relationship between the response variable and at least one of the explanatory variables, the next step is to determine which of the explanatory variables is significant. The t-test is used to determine this, with the following null and alternate claims for each coefficient:

H0: βi = 0
H1: βi ≠ 0
The test statistic is the following ratio (with n − (k + 1) degrees of freedom), which can be determined by Microsoft Excel:

t = bi / s_bi

where bi is the sample coefficient for explanatory variable i, and s_bi is its standard error.
Example 14.3B on page 545 uses t-tests of individual coefficients in a regression model to examine their significance. This section also introduces students to the adjusted R², which follows from the definition of R² and is defined as:

adjusted R² = 1 − (1 − R²) × (n − 1)/(n − (k + 1))

The adjusted R² is a modification of R² that adjusts for the number of terms in a model. R² always increases when a new term is added to a model, but adjusted R² increases only if the new term improves the model more than would be expected by chance.
Teaching Tip: It must be made clear to students that adding additional variables does not necessarily make a better model. This is what makes understanding the adjusted R2 essential.
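The F-test, the t-tests, and the two R² measures from this section can all be computed directly from the sums of squares. A Python sketch with simulated data (not from the text; all variable names are illustrative):

```python
import numpy as np
from scipy import stats

# Simulated data: n = 10 observations, k = 2 explanatory variables.
rng = np.random.default_rng(2)
x1 = rng.uniform(0, 10, 10)
x2 = rng.uniform(0, 5, 10)
y = 3 + 2 * x1 + 1.5 * x2 + rng.normal(0, 1, 10)

n, k = len(y), 2
X = np.column_stack([np.ones(n), x1, x2])
b, *_ = np.linalg.lstsq(X, y, rcond=None)
y_hat = X @ b

sse = np.sum((y - y_hat) ** 2)          # error sum of squares
ssr = np.sum((y_hat - y.mean()) ** 2)   # regression sum of squares
sst = np.sum((y - y.mean()) ** 2)       # total sum of squares

# Overall F-test: F = MSR / MSE, with (k, n - (k + 1)) degrees of freedom.
F = (ssr / k) / (sse / (n - (k + 1)))
p_value = 1 - stats.f.cdf(F, k, n - (k + 1))

# t-test for each coefficient: t = b_i / s_bi.
mse = sse / (n - (k + 1))
s_b = np.sqrt(mse * np.diag(np.linalg.inv(X.T @ X)))
t_stats = b / s_b

# R-squared and adjusted R-squared.
r2 = ssr / sst
adj_r2 = 1 - (1 - r2) * (n - 1) / (n - (k + 1))
```

These are the same quantities Excel's Regression tool reports in its ANOVA and coefficient tables, so the sketch can be used to cross-check Excel output.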
14.4 Making Predictions

Section 14.4 describes an Excel add-in that you can use to make predictions of average and individual response values, given specific values of the explanatory variables in the model. An Excel add-in (Multiple Regression Tools) has been created to do these calculations. This add-in was first introduced in Chapter 13 (see page 514). There is a discussion of the tool on page 548. Example 14.4 on page 549 shows how to calculate confidence and prediction intervals with Excel.

14.5 Selecting The Appropriate Explanatory Variables

This section looks at ways to assess and deal with a new problem that may arise when there is more than one explanatory variable. This problem is usually referred to as "multicollinearity," and it occurs when one of the explanatory variables is related to one or more of the other explanatory variables. For example, imagine you are a real estate agent and wish to create a model to predict the selling price of a new home. You decide to include explanatory variables for the number of bedrooms and the number of bathrooms. You would expect these variables to be related. As the number of bedrooms in the house increases, one would expect the number of bathrooms to increase
as well. What we are experiencing is the problem of multicollinearity. Care must be taken when creating multiple regression models. Skuce has developed an Excel add-in called "Multiple Regression Tools Add-In, All Possible Regressions Calculations" to easily create all possible models from a data set. The explanation is on page 552. Example 14.5A on page 554 assesses all possible regression models for the Woodbon sales data, which has been discussed extensively in the text. Remember the goals of regression models:

Goals of Regression Models:
1. The model should be easy to use. It should be reasonably easy to acquire data for the model's explanatory variables.
2. The model should be reasonable. The coefficients should represent a reasonable cause-and-effect relationship between the response variable and the explanatory variables.
3. The model should make useful and reliable predictions. Prediction and confidence intervals should be reasonably narrow.
4. The model should be stable. It should not be significantly affected by small changes in explanatory variable data.
14.6 Using Indicator Variables In Multiple Regression

In a multiple regression model, one might want to include a qualitative variable, such as gender or whether the employee has a master's degree. Indicator variables can be included in regression models by coding the data. For example, we can assign female = 1 and male = 0. Indicator variables are not limited to qualitative variables with only two responses (i.e., Yes-No answers): Example 14.6 on page 562 discusses the procedure for using two indicator variables to represent three different types of batteries.
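The coding scheme can be sketched as follows, loosely mirroring the three-battery-type idea of Example 14.6. The lifetimes below are invented, and the column names are our own; with only indicator columns, the fitted coefficients are simply the base-category mean and the differences from it.

```python
import numpy as np

# Three battery types coded with two indicator variables.
battery_type = ["A", "B", "C", "A", "C", "B", "A", "C"]

# Type A is the base category: it gets 0 on both indicators.
is_b = np.array([1.0 if t == "B" else 0.0 for t in battery_type])
is_c = np.array([1.0 if t == "C" else 0.0 for t in battery_type])

lifetime = np.array([10.2, 11.5, 9.1, 10.0, 9.4, 11.9, 10.4, 8.8])

X = np.column_stack([np.ones(len(lifetime)), is_b, is_c])
b, *_ = np.linalg.lstsq(X, lifetime, rcond=None)
# b[0] = mean lifetime of type A; b[1], b[2] = how B and C differ from A.
```

With the data above, type A batteries average 10.2, type B average 11.7, and type C average 9.1, so the fitted coefficients are 10.2, +1.5, and −1.1.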
14.7 More Advanced Modelling At the end of the chapter on page 565 of the text, there is a short discussion of more advanced modelling, and the possible extensions of the concepts presented here. The point is made that simple models (with good predictive abilities) are preferred to complicated models.
Go to MyStatLab at www.mathxl.com. Encourage students to use MyStatLab to prepare for their quizzes, tests, midterms, and final exams. Students can practise the exercises marked in red as often as they want, and guided solutions will help them find answers step by step. A personalized study plan is available to them as well.
Discussion Questions:

1. What is the difference between R-Square and Adjusted R-Square when running a regression analysis?

R² is a summary measure that gives some information about the goodness of fit of a model. In regression, the R² coefficient of determination is a statistical measure of how much of the variation in the response variable is explained by the input or predictor variables. Adjusted R² is a modification of R² that adjusts for the number of explanatory terms in a model. Unlike R², the adjusted R² increases only if a new term improves the model more than would be expected by chance. The adjusted R² will always be less than or equal to R². Adjusted R² does not have the same interpretation as R², so care must be taken in interpreting and reporting this statistic. It is important to note that adjusted R² is not always better than R². The adjusted R² will be more useful only if the R² is calculated based on a sample, not the entire population. For example, if our unit of analysis is a province, and we have data for all cities in the province, then adjusted R² will not yield any more useful information than R².
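The key contrast, that R² never decreases when a variable is added while adjusted R² penalizes useless variables, can be demonstrated with a quick simulation. Everything below is illustrative; the helper function is our own.

```python
import numpy as np

def r2_and_adj(X, y):
    """R-squared and adjusted R-squared for a least-squares fit.
    X must include a column of 1s; p counts all columns including the intercept."""
    n, p = X.shape
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    sse = np.sum((y - X @ b) ** 2)
    sst = np.sum((y - y.mean()) ** 2)
    r2 = 1 - sse / sst
    adj = 1 - (1 - r2) * (n - 1) / (n - p)
    return r2, adj

rng = np.random.default_rng(3)
n = 30
x1 = rng.uniform(0, 10, n)
y = 5 + 2 * x1 + rng.normal(0, 2, n)
noise = rng.normal(0, 1, n)          # a completely unrelated "explanatory" variable

X_small = np.column_stack([np.ones(n), x1])
X_big = np.column_stack([np.ones(n), x1, noise])

r2_s, adj_s = r2_and_adj(X_small, y)
r2_b, adj_b = r2_and_adj(X_big, y)
# r2_b >= r2_s is guaranteed; adj_b need not be larger than adj_s.
```

Adding the pure-noise column can only raise (or leave unchanged) R², which is exactly why R² alone cannot be trusted to choose between models.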
2. How can we avoid multicollinearity?

We can avoid multicollinearity issues by carefully selecting the explanatory variables for our multiple regression model. If mortgage rates, for instance, are being considered as an explanatory variable in a model about selling prices of houses, it would not make sense to include prime rates or another mortgage rate in the model. Since these variables are likely highly correlated with each other, most of the explanatory power is gained when the first variable is introduced. The second variable will not likely tell us more about the response variable. There are various other methods aimed at identifying collinear variables, if the relationship between them is not immediately obvious. One of these is to create a scatter diagram of the relationship of every explanatory variable with every other explanatory variable. Another method of assessing the correlation between the explanatory variables is to create a correlation matrix for the variables. This is easy to do with Excel, using the Correlation tool of the Data Analysis add-in. The correlation coefficients tell us something about how the variables are related as pairs. It is also possible that one explanatory variable could be simultaneously related to two other explanatory variables, and neither the scatter diagrams nor the correlation matrix will reveal this. The scatter diagrams and the correlation matrix will help identify obvious pairwise correlations between variables, but other, more difficult to identify sources of multicollinearity may also be present. We must do our best to find and eliminate multicollinearity from our regression model.
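A correlation matrix like the one Excel's Correlation tool produces can be sketched in Python. The variables below are simulated so that bedrooms and bathrooms are deliberately related, while lot size is not; all names and values are illustrative.

```python
import numpy as np

# Hypothetical explanatory variables for a house-price model.
rng = np.random.default_rng(4)
bedrooms = rng.integers(1, 6, 40).astype(float)
bathrooms = bedrooms * 0.5 + rng.normal(0, 0.3, 40)   # built to be related to bedrooms
lot_size = rng.uniform(2000, 8000, 40)                 # unrelated to the other two

# Pairwise correlation matrix (rows/columns: bedrooms, bathrooms, lot_size).
variables = np.vstack([bedrooms, bathrooms, lot_size])
corr = np.corrcoef(variables)

# A pairwise correlation near +1 or -1 flags possible multicollinearity.
bed_bath_corr = corr[0, 1]
```

Here `bed_bath_corr` will be strongly positive, signalling that including both bedrooms and bathrooms in the same model risks multicollinearity, exactly the real estate example discussed above.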
Appendix
Solutions to Odd-Numbered Exercises

Chapter 1 Solutions

Develop Your Skills 1.1
1. You would have to collect these data directly from the students, by asking them. This would be difficult and time-consuming, unless you are attending a very small school. You might be able to get a list of all the students attending the school, but privacy protection laws would make this difficult. No matter how much you tried, you would probably find it impossible to locate and interview every single student (some would be absent because of illness or work commitments or because they do not attend class regularly). Some people may refuse to answer your questions. Some people may lie about their music preferences. It would be difficult to solve some of these problems. You might ask for the school's cooperation in contacting students, but it is unlikely they would comply. You could offer some kind of reward for students who participate, but this could be expensive. You could enter participants' names in a contest, with a music-related reward available. None of these approaches could guarantee that you could collect all the data, or that students would accurately report their preferences. One partial solution would be to collect data from a random sample of students, as you will see in the discussion in Section 1.2 of the text. Without a list of all students, it would be difficult to ensure that you had a truly random sample, but this approach is probably more workable than a census (that is, interviewing every student).

3.
Statistics Canada has a CANSIM Table 203-0010, Survey of household spending (SHS), household spending on recreation, by province and territory, annual, which contains information on purchases of bicycles, parts and accessories. There is a U.S. trade publication called "Bicycle Retailer & Industry News", which provides information about the industry. See http://www.bicycleretailer.com/. Access is provided through the Business Source Complete database. Industry Canada provides a STAT-USA report on the bicycle industry in Canada, at http://strategis.ic.gc.ca/epic/internet/inimr-ri.nsf/en/gr105431e.html. Somewhat outdated information is also available at http://www.ic.gc.ca/eic/site/sg-as.nsf/eng/sg03430.html. Canadian Business magazine has a number of articles on the bicycle industry. One of the most recent describes the purchase of the Iron Horse Co. of New York by Dorel Industries (a Montreal firm). http://www.canadianbusiness.com/markets/headline_news/article.jsp?content=b15609913
5. At least some of the secondary data sources listed in Section 1.1 should help you. If you cannot locate any secondary data, get help from a librarian.
Develop Your Skills 1.2
7. This is a nonstatistical sample, and could be described as a convenience sample. The restaurant presumably has diners on nights other than Friday, and none of these could be selected for the sample. The owner should not rely on the sample data to describe all of the restaurant's diners, although the sample might be useful to test reaction to a new menu item, for example.
9. Follow the instructions for Example 1.2c. The random sample you get will be different, but here is one example of the 10 names selected randomly:

AVERY MOORE
EMILY MCCONNELL
HARRIET COOGAN
DYLAN MILES
TERRY DUNCAN
GEORGE BARTON
JAMES BARCLAY
AVA WORTH
PAIGE EATON
JORDAN BOCK
Develop Your Skills 1.3
11. This is impossible. A price cannot decrease by more than 100% (and a 100% decrease would mean the price was 0). It is likely that the company means that the old price is 125% of the current price. So, for example, if the old price was $250, then the new price would be $200. You can see that 250/200 = 1.25 or 125%.

13. "Jane Woodsman's average grade has increased from 13.8% last semester to 16.6% this semester." The provocative language of the initial statement ("astonishing progress", "substantial 20%") is inappropriate. As well, the 20% figure, used as it is here, suggests something different from the facts. Jane's grades did increase by 20%, but this is only 20% of the original grade of 13.8%, so it is not much of an improvement.

15. Yes. There is no distortion in how the data are represented, and the graph is clearly labelled and easy to understand.

Develop Your Skills 1.4
17. No. Even if the study was randomized (no information is provided), it would not be legitimate to make this conclusion. While taller men are more likely to be married, we cannot conclude that they are more likely to be married because they are taller. There could be many other factors at work.

19. If you had compared the sales performance of a randomly-selected group of salespeople (not only poor performers), you would be able to come to a stronger conclusion about the diary system's impact on increased sales.

Develop Your Skills 1.5
21. a. In this case, the national manager of quality control probably has a good grasp of statistical approaches. While you should still strive for clarity and simplicity, you can include more of the technical work in the body of the report. Printouts of computer-based analysis would be included in the appendix.
b. While human resources professionals probably have some understanding of statistical analysis, they are less likely to understand the details. In this case, you should write your report with a minimum of statistical jargon.
The body of your report should contain key results, but the details of your analysis should be saved for the appendix. c. In this case, you can assume no statistical expertise in your readers. While you should still report on how your analysis was conducted, and how you arrived at your conclusions, you probably would not send this part of the report to your customers. The report you send your boss should be easily understandable to everyone. The challenge here will be to make your conclusions easy to understand, while not oversimplifying, or suggesting that your results are stronger than they actually are.
23. The average amount of paint in a random sample of 30 cans was 3.012 litres, compared with the target level of 3 litres. This sample mean is within control chart limits, indicating no need to adjust the paint filling line.

25. In fact, some studies have shown that there is a positive relationship between height and income (that is, taller people tend to have higher incomes). However, all such studies must be observational (there is no ethical way to control height!), and so the cause-and-effect conclusion suggested here is not valid. The statement could be rewritten as follows: "A study has shown a strong positive relationship between height and income." You might even go on to discourage the unsophisticated reader from jumping to conclusions, as follows: "Of course, this should not be interpreted as meaning that greater height guarantees higher income, or that you cannot earn a high income if you are short."

Chapter Review Exercises
1. Collecting data usually leads to a better understanding of the question and a better decision.

3.
It is generally not valid to draw conclusions about a population on the basis of a convenience sample. Because there is no way to estimate the probability of any particular element of the population being selected for a convenience sample, we cannot control or estimate the probability of error.
5. Decision-makers may not be statisticians. Statistical analysis is powerful only if it is communicated so that those making the decision can understand the story the data are telling.

7. It is possible to make reliable conclusions on the basis of 1,003 responses. The sample size may seem small, when you consider there are millions of adult Canadians who have retirement funds. However, the sample size required depends not on the size of the population, but on its variability. If all Canadians were exactly the same, a sample of one would be sufficient. The more variable Canadians are, the larger the sample size required for estimates of a desired level of accuracy. This is something you will explore more in Chapter 8.

9. Done correctly, such a study could identify a positive relationship between height and income. However, it is not correct to suggest the study means that height is the cause of the differences in incomes, although if the study was well-designed, other causes could have been randomized. Still, the leap to evolution as the explanation is not at all justified by the study. While this may be the explanation, it is only one of many possible explanations.
11. There are many examples of loyalty programs: Airmiles, President's Choice Financial rewards, American Express rewards, HBC rewards, PetroCanada's Petro-Points, Sears Club points, Aeroplan. Enter "loyalty rewards programs" into an Internet search engine, and you will find references to many such programs.

13. The article describes the New Coke story as "the greatest marketing disaster of all time". The research failed to uncover the attachment people felt to the original Coke. A question such as "Would you switch to the New Coke?" might have revealed how loyal customers were to the original Coke.

15. While the title used in the report is accurate, the percentage decrease is relatively small, and the actual number of drivers has increased. The title could easily be misunderstood.
17. Of course, because your samples are randomly-selected, your samples cannot be predicted. When the author did this exercise, the sample averages were as shown below. The population average is 65.0. The sample averages ranged from 60.7 to 73.7. Some were quite close to the population average, and the largest difference between a sample average and the true value was 8.7.

Average Mark from 10 Randomly-Selected Samples of Size 10:
60.7, 61.8, 65.5, 65.6, 67.4, 69.8, 70.3, 71.3, 73.0, 73.7

Average Mark from 10 Randomly-Selected Samples of Size 15:
58.6, 61.2, 62.6, 64.0, 66.2, 66.9, 67.2, 67.7, 68.4, 70.8

19. Based on the author's results for exercises 17 and 18 above (yours will be different): the average of the sample averages, when the sample size is 10, is 67.9. The average of the sample averages, when the sample size is 15, is 65.4. The average of the sample means is closer to the true population average value when the sample size is larger. You will investigate this more in Chapter 6.
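The repeated-sampling behaviour in exercises 17 and 19 can be reproduced with a small simulation. The population of marks below is invented, not the textbook's data set, and the function name is our own.

```python
import random

# A hypothetical "population" of 100 marks, roughly centred on 65.
random.seed(42)
population = [random.gauss(65, 12) for _ in range(100)]
pop_mean = sum(population) / len(population)

def sample_means(n, trials=10):
    """Average mark in `trials` random samples of size n, drawn without replacement."""
    return [sum(random.sample(population, n)) / n for _ in range(trials)]

means_10 = sample_means(10)
means_15 = sample_means(15)

# Larger samples tend to scatter less widely around the population mean.
spread_10 = max(means_10) - min(means_10)
spread_15 = max(means_15) - min(means_15)
```

With only 10 trials the larger-sample spread is not guaranteed to be smaller in every run, which itself is a useful talking point about sampling variability.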
Chapter 2 Solutions

Develop Your Skills 2.1
1. The number of dented cans is a count of qualitative data (think of it this way—the original data might be recorded as "yes" or "no" to the question: is the can dented?). These are also time-series data, as they are collected over successive periods of time. The data are discrete.

3. The employees' final average grades from college are continuous quantitative data. The scores assigned by the supervisors are ranked data. Although the questionnaire will help, such a ranking is somewhat subjective. If different supervisors assign the ranks, they may not be comparable.

5. Postal codes are qualitative data.
Develop Your Skills 2.2 7. A frequency distribution and histogram are shown below.
Survey of Drugstore Customers
Customer Income        Number of Customers
$30,000 to <$35,000    2
$35,000 to <$40,000    9
$40,000 to <$45,000    14
$45,000 to <$50,000    7
$50,000 to <$55,000    5
$55,000 to <$60,000    6
$60,000 to <$65,000    3
$65,000 to <$70,000    4

[Histogram: "Survey of Drugstore Customers: Customer Incomes"; Number of Customers (0 to 16) by Customer Income class]
Choosing class widths is a bit tricky in this instance. The class width template suggests class widths of 5155, 8905, or 8170. None of these numbers is that comfortable for incomes. Class widths of $5,000 and $10,000 were considered. A class width of $10,000 was discarded because it would have resulted in only four classes (five is a good minimum number of classes).
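The binning step behind a frequency distribution like the one above can be sketched as follows, using $5,000 classes from $30,000 to <$70,000. The incomes are simulated for illustration; they are not the survey data from the text.

```python
import numpy as np

# Simulated customer incomes, kept inside the $30,000-$70,000 range.
rng = np.random.default_rng(5)
incomes = np.clip(rng.normal(46000, 9000, 50), 30000, 69999)

# Class limits: $30,000 to <$35,000, $35,000 to <$40,000, ..., $65,000 to <$70,000.
edges = np.arange(30000, 75000, 5000)
counts, _ = np.histogram(incomes, bins=edges)

# (lower limit, upper limit, frequency) for each class.
freq_table = list(zip(edges[:-1], edges[1:], counts))
```

Eight classes of width $5,000 satisfy the usual guideline of having at least five classes, which is the same trade-off discussed above.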
The distribution of customer incomes in the drugstore survey is skewed to the right. Most customer incomes are in the $35,000 to $50,000 range, but there are a number of customers with higher incomes, the highest being $68,800.

9.
This histogram totally fails at its job of summarizing the accompanying data set.
1. The graph does not have meaningful titles or labels. It completely fails to communicate what it's about.
2. There are gaps between the bars, which there should not be.
3. It appears that the creator of this graph used bin numbers correctly, but s/he forgot to round them for presentation. The graph should show lower class limits along the x-axis in the proper location, that is, aligned under the left-hand side of each bar.
Develop Your Skills 2.3 11. Either a bar graph or a pie chart would be appropriate.
[Bar graph: "Survey of Drugstore Customers: Speed of Service Ratings"; Number of Customers (0 to 20) by Rating: Excellent, Good, Fair, Poor]
[Pie chart: "Survey of Drugstore Customers: Speed of Service Ratings"; Excellent 6%, Good 38%, Fair 38%, Poor 18%]
The graphs indicate that over a third of customers (38%) rated the speed of service as good, but only 6% rated it as excellent. Over a third of customers (again, 38%) rated the speed of service as only fair, while 18% rated it as poor. These ratings indicate that there may be some room for improvement in speed of service at the drugstore.
13. Since we want to compare the number of defects by shift, it is appropriate to compare the categories for the number of defects across the horizontal axis. Since each shift produced a different number of items¹, it makes sense to use relative frequencies. For each shift, calculate the percentage of items with no defects, one minor defect, more than one minor defect, and then create a bar graph.
[Bar graph: "Defects Observed at a Manufacturing Plant, by Shift"; Percentage of Total Number of Items Produced (0% to 100%) for each shift (8:00 a.m.–4:00 p.m., 4:00 p.m.–midnight, midnight–8:00 a.m.), grouped by category: Items With No Apparent Defects, Items With One Minor Defect, Items With More Than One Minor Defect]
Across all three shifts, the percentage of items produced with more than one minor defect is small. For all three shifts, by far the greatest percentage of items produced has no apparent defects. The midnight-8:00 a.m. shift has the greatest percentage of defects, and the 4:00 p.m. – midnight shift has the lowest percentage of defects.
¹ This too is interesting, and the fact should be included in any accompanying report.
15.
[Bar graph: "Survey of a Random Sample of People Walking Around Kempenfelt Bay"; Number of People (0 to 14) by Favourite Flavour of Ice Cream: Vanilla, Chocolate, Strawberry, Maple Walnut, Chocolate Chip, Pralines, Other]
Vanilla and chocolate were tied as the most frequently-mentioned favourite flavours of ice cream among the people surveyed at Kempenfelt Bay, followed by chocolate chip. The fourth most popular flavour was strawberry. Only a few people cited maple walnut as their favourite flavour, and no one called pralines their favourite. One person had a favourite flavour other than the ones cited specifically in the survey. Of course, there are other options for this graphical display. It might be helpful to arrange the categories from most preferred to least-preferred. As well, a pie graph is an option.
[Bar graph: "Survey of a Random Sample of People Walking Around Kempenfelt Bay"; the same data with flavours arranged from most preferred to least preferred: Vanilla, Chocolate, Chocolate Chip, Strawberry, Maple Walnut, Other, Pralines]
[Pie chart: "Survey of a Random Sample of People Walking Around Kempenfelt Bay"; Vanilla 28%, Chocolate 28%, Chocolate Chip 21%, Strawberry 17%, Maple Walnut 4%, Other 2%, Pralines 0%]
Develop Your Skills 2.4 17. Whatever your data source (as long as the data are accurate) you should see that over this period, the price of $1US in Canadian dollars was on an increasing trend from January 2000 until the beginning of 2002, with the highest exchange value of 1.599618 (monthly average) in January of 2002. From then until near the end of 2007, the exchange value of the US dollar in terms of Canadian dollars was on a declining trend, reaching 0.968 in November of 2007. The rate then stabilized around par, beginning to increase in the latter part of 2008, ending with a monthly average rate of 1.2343619 in December. The graph below shows the trends.
19. Your commentary should describe the data for the company you chose. Here is a checklist to help you:
- Be sure to note the start and end dates for the data.
- Comment on at least a couple of specific values in the data set (e.g., the high and low for the period).
- Keep your language objective and descriptive. Do not leap to any conclusions about why the data might look the way they do.
Develop Your Skills 2.5
21.
[Scatter diagram: Spending on Restaurant Meals and Income; x-axis: Monthly Income, $0 to $5,000; y-axis: Monthly Spending on Restaurant Meals, $0 to $250]
There appears to be a slight positive relationship between monthly income and monthly spending on restaurant meals, that is, the higher the monthly income, the greater the monthly spending on restaurant meals. However, there is a great deal of variability in the spending on restaurant meals, and the relationship is weak.
23.
[Scatter diagram: Survey of Drugstore Customers; x-axis: Customer Annual Income, $20,000 to $80,000; y-axis: Amount of Most Recent Purchase, $0 to $45]
There does not appear to be a strong relationship between the customer’s income and the amount of the most recent purchase. Note the scale on the x-axis does not start at zero. If you think about it, this should not come as a surprise. While we might have expected those customers with greater incomes to have higher purchases, this effect is less likely to appear for a single purchase. There may be more of a positive relationship between annual income and total annual drugstore purchases. 25. Exhibit 2.70c is not correct, because the explanatory variable is years of service, and it should be graphed on the x-axis. Exhibit 2.70b is probably not correct, because it depicts a negative relationship, that is, those with more years of service earn lower salaries. Exhibit 2.70a is the only possible choice, as it shows higher salaries associated with longer years of service.
Develop Your Skills 2.6 27. The pictograph looks as follows:
The year 2000 dollar is worth just under half of the value of the 1980 loonie, but the total area of the year 2000 loonie is only about a quarter of the area of the year 1980 loonie, so the pictograph misleads the viewer. When the loonie is shrunk, it shrinks not only in height (as a bar in a bar graph would) but also in width. The image itself is not distorted, but the area decreases disproportionately, with the square of the scaling factor. 29. This graph cannot be interesting, because we have no clue what it is about. We can see that the distribution is skewed, but that’s all. With no title, and no meaningful labels on the axes, the graph is useless. Chapter Review Exercises 1a.
These data are qualitative, unranked, and cross-sectional.
1c.
These data are quantitative, discrete, and cross-sectional.
1e.
These are time-series continuous quantitative data.
2a. A double bar graph could show males and females along the x-axis, with two bars above, one for those with fitness club membership, one bar for those without. Alternatively, categories of fitness club membership could show along the x-axis, with bars for males and females above each category. 2c. If the total number of pedestrians is recorded, there are only two data points, the number of people who passed by each location. A graph would not really add much to a simple table displaying these numbers, with a proper title and headings. 2e. It is likely that there is interest in the relationship between sales and advertising. A scatter diagram would be appropriate, with advertising along the x-axis, and sales on the y-axis. 3.
These graphs are meant to be amusing and entertaining. Quirky images and bright colours make them attractive, but they are not good examples of graphs to summarize data.
5.
It appears that there is a positive correlation between months of experience in the company and salary, but the correlation does not appear to be particularly strong. In fact, the two observations with the highest salaries give the appearance of a positive correlation, and without them, there is no obvious relationship.
7.
There are two possible graphical displays, a bar chart or a pie chart. Both are shown below. The pie chart has been formatted for black and white printout.
[Bar graph: Ratings from a 360 Degree Review for a Trainee; x-axis categories: Worst Possible Performance, Poor (With Major Improvement Required), Acceptable, Good, Very Good (Very Little Improvement Required), Best Possible Performance; y-axis: Number of Ratings, 0 to 5]

[Pie chart, formatted for black and white printout: Ratings from a 360 Degree Review for a Trainee; Best Possible Performance 7%, Very Good (Very Little Improvement Required) 20%, Good 27%, Acceptable 13%, Poor (With Major Improvement Required) 27%, Worst Possible Performance 6%]
Whichever graphical display is used, it is apparent that the trainee’s ratings are not consistent. About 67% of raters indicated that the trainee's performance was acceptable or better. However, 27% suggested that major improvement was required, and 6% rated the trainee's performance as the worst possible. Certainly, there seems to be a wide range of opinions about this trainee.
9. In this case, since the number of students in each sample is the same, it is appropriate to compare the number of students directly. An appropriate graph is shown below. The graph shows that the B.C. students were much more likely to rate this university as “excellent” than the Ontario students, with Ontario students much more likely to rate it as “poor”. The Ontario and B.C. students have different opinions about this university.
[Double bar graph: Ratings of a Canadian University by Ontario and BC Students; x-axis: Excellent, Good, Fair, Poor; y-axis: Number of Students, 0 to 9; series: Ontario Student Ratings, BC Student Ratings]
11. Since the samples are different sizes, relative frequencies must be used to make the comparison.
[Double bar graph: Origins of Students in Two College Programs; x-axis: Business, Technology, Nursing; y-axis: Percentage of Students in Program, 0% to 60%; series: From Local Area, Not From Local Area]
All three program areas draw a greater percentage of students from outside the local area than from within it, although the tendency is strongest for the Business program (about 57% of students not from the local area) and weakest for Nursing (about 51% of students not from the local area).
13. Two histograms, properly set up for comparison, are shown below.
[Histogram: Marks for a Random Sample of Students in Ms. Nice's Statistics Class; x-axis: Final Grade (%); y-axis: Number of Marks, 0 to 12]

[Histogram: Marks for a Random Sample of Students in Mr. Mean's Statistics Class; x-axis: Final Grade (%); y-axis: Number of Marks, 0 to 12]
Notice that the graphs are set up with the same x- and y-axis scales and are similarly sized, so that a direct visual comparison is possible. Class widths of 10 were used, because these are comfortable for marks data, and they allow us to make a distinction between passing and failing grades (assuming 50 is a pass). The marks of the students from Mr. Mean’s class are generally higher and less variable than the marks of the students from Ms. Nice’s class. Half of the students from Ms. Nice’s class failed the course, while only two of the students from Mr. Mean’s class failed.
15. An appropriate histogram is shown below.
[Histogram: Downtown Automotive, Random Sample of Daily Sales; x-axis: Daily Sales; y-axis: Number of Days, 0 to 12]
For Downtown Automotive, daily sales are usually above $1,000, with sales falling into the $1,000 to < $1,500 class on 10 of the 29 days in the sample. The distribution is somewhat skewed to the right, that is, there are a few days when sales are above $2,000. Daily sales range from $690 to $2,878.
17. In this case, while there may be an association between the two variables, the causality link would not be strong. It would not be correct to say that a high mark in Business Math caused a high mark in Statistics, because there is very little overlap between the content of the two courses. However, a student with good study habits and good class attendance might do better in both courses. In this case, while the Business Math mark is not really the explanatory variable, since this course came first, we will put it on the x-axis. A graph of the data is shown below.
[Scatter diagram: Marks from First and Second Year for a Random Sample of Students; x-axis: Business Math Mark (%), 0 to 120; y-axis: Statistics Mark (%), 0 to 120]
There does appear to be a positive relationship between the two marks. 19. (Choosing an appropriate class width for comparison takes some thought. $10,000 is probably too wide (resulting in only 4 classes), and $5,000 is probably too narrow. A class width of $7,500 was used for the two histograms shown on the next page. Because the samples are of different size, relative frequencies should be used for comparison.)
[Histogram: Survey of Drugstore Customers, Annual Incomes of Males; x-axis: Annual Income; y-axis: Percentage of Male Customers, 0% to 50%]

[Histogram: Survey of Drugstore Customers, Annual Incomes of Females; x-axis: Annual Income; y-axis: Percentage of Female Customers, 0% to 50%]
Annual incomes for female drugstore customers are generally in the $37,500 to < $45,000 class, which accounts for over 48% of female customers' incomes. Some incomes of female customers are higher, but this is unusual in the sample, so the distribution of female customers' incomes is skewed to the right. In contrast, the incomes of male drugstore customers are more variable. Incomes between $30,000 and < $60,000 account for over 86% of male customers' incomes, with incomes spread fairly evenly throughout this range. In general, greater percentages of male customers' incomes are in the higher classes.
21. Two histograms are shown below. Note that samples are the same size, so relative frequencies are not required.
[Histogram: Flight Delays Before Airport Upgrades; x-axis: Flight Delay in Minutes; y-axis: Number of Flights, 0 to 16]

[Histogram: Flight Delays After Airport Upgrades; x-axis: Flight Delay in Minutes; y-axis: Number of Flights, 0 to 16]
The histograms seem to indicate that flight delays have changed after the airport upgrade. Before the upgrade, flight delays were mainly in the 10 to < 40 minute range. Only two delays were less than 10 minutes, and three were more than 40 minutes (but less than 50 minutes). After the upgrades, there were seven delays less than 10 minutes, so a greater number of flights had shorter delays. As well, there was only one flight delayed more than 40 minutes. However, the number of flight delays of 10 to < 20 minutes was reduced from 7 to 3 after the upgrades, while the number of delays of 20 to < 30 minutes increased from 13 to 14. The greater number of flights with delays less than 10 minutes indicates some reduction in delays, but results appear mixed.
23. The three histograms are shown below.
[Three histograms: Quarterly Operating Profits, Canadian Oil and Gas Extraction and Support Activities, I 1988 to III 2008; x-axis: Millions of Dollars; y-axis: Number of Quarters; the three versions use successively wider class widths]
All three histograms show the same general shape, that is, the distribution is right-skewed. In most quarters, operating profits in the oil and gas sector were below $1.5 billion, but there were much higher profits in some quarters. In the first histogram the classes may be too narrow, as there are very low frequencies in many of the classes. However, this histogram provides more information about the many quarters when operating
profits were low, as there is a breakdown for below $1 billion, and from $1 billion to < $2 billion. This information is hidden in the histogram with the widest classes. It can be a challenge to decide on appropriate class widths when the distribution is very skewed. In the histogram with the widest classes, a lot of data is contained in the first class (half of the data points are there), and so these classes may be a bit wide. However, any one of these histograms would be acceptable. The particular choice depends on the focus of the analysis.
Chapter 3 Solutions
Develop Your Skills 3.1
1. Σy = 2 + 4 + 6 + 8 = 20
3. Σy/n = 20/4 = 5 and Σx/n = 16/4 = 4
5. Consider the data set: 34, 67, 2, 31, 89, 35. For this data set, calculate:
a. Σx = 34 + 67 + 2 + 31 + 89 + 35 = 258
b. Σx² = 34² + 67² + 2² + 31² + 89² + 35² = 15756
c. Σx/n = 258/6 = 43
d. s = √[(Σx² − (Σx)²/n)/(n − 1)] = √[(15756 − (258)²/6)/5] = √[(15756 − 11094)/5] = √(4662/5) = √932.4 = 30.535
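The shortcut formula in part (d) can be verified with a few lines of Python (a sketch; the variable names are ours, and the result is checked against the definitional formula in Python's statistics module):

```python
import math
import statistics

data = [34, 67, 2, 31, 89, 35]
n = len(data)
sum_x = sum(data)                   # Σx = 258
sum_x2 = sum(x * x for x in data)   # Σx² = 15756
# Shortcut formula for the sample standard deviation
s = math.sqrt((sum_x2 - sum_x ** 2 / n) / (n - 1))
print(round(s, 3))   # 30.535
```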
Develop Your Skills 3.2 7. The mean income is $47,868.10, and the median income is $44,925. This data set is skewed to the right (as we saw when we created the histogram of incomes in Develop Your Skills 2.2, Exercise 7). As a result the unusually high incomes have pulled the mean to the right of the median. The median is the better measure of central tendency. 9.
Because the mean and the median are almost equal, we expect the distribution to be symmetric.
Develop Your Skills 3.3
11. Since the age data are skewed to the right, the IQR is the best measure of variability. Using Excel calculations, we find: Q1 = 31, Q3 = 42, IQR = 11. The Empirical Rule could not be applied here, as the data are not symmetric and bell-shaped.
13. Since this data set is fairly symmetric with no obvious outliers, the standard deviation is the preferred measure of variability.
s = √[(Σx² − (Σx)²/n)/(n − 1)] = √[(18381 − 653²/25)/24] = √[(18381 − 17056.36)/24] = √(1324.64/24) = √55.19333 = 7.429
15. The mean number of daily customers at Downtown Automotive is 26.12 (calculated for Develop Your Skills 3.2, Exercise 8). The standard deviation is 7.43 (the answer to Exercise 13 above). The Empirical Rule says that about 95% of the data points will lie within 2 standard deviations of the mean. x + 2s = 26.12 + 2(7.43) = 40.98 and x – 2s = 26.12 – 2(7.43) = 11.26 If this sample is representative of the population, then 95% of the daily customer counts will be between 11.26 and 40.98. Since the data set is (more or less) symmetric, this means about 2½% of the data will lie below 11.26, and about 2½% will lie above 40.98. About 97.5% of the time, the maximum number of customers Doug would need to plan for is 41. Develop Your Skills 3.4 17. The only choice is b (-0.88). Choices a and c are incorrect, because they are positive and the relationship is clearly negative. Choice d is not correct, because the negative relationship is obviously fairly strong. 19. These are quantitative data, and the graph created for Develop Your Skills 2.5, Exercise 24 shows a linear relationship. The Pearson r is the correct measure of association. Excel calculates it at -0.67 (note that you must check for linearity of the relationship before you calculate the Pearson r). There is a negative relationship between the two variables. The greater the number of hours of paid employment during the semester, the lower the semester average mark. Chapter Review Exercises 1. The mean mark is quite a bit higher than the median mark. This suggests that the distribution of marks is skewed to the right. It is likely that there are a few unusually high marks in the distribution. 3.
The mean age is 26.05, and the median age is 20.5. This is as expected. Because the distribution of ages is skewed to the right, the mean is greater than the median. There are several modes in the data set: 8, 9, 12, 20. Clearly, the three lower modes are not good indications of central tendency in this data set. The standard deviation is 17.1. Calculation of the interquartile range (manual method) is as follows: The location of Q1 is 5.25, and its value is 12. The location of Q3 is 15.75, and its value is 38, so the IQR is 26.
5.
First, remember that with sample data, we cannot absolutely prove anything. As well, although the correlation coefficient is low, this does not mean that there is no relationship between incomes and purchases. As discussed in the answer to Develop Your Skills Exercise 23 in Chapter 2, the lack of relationship between an individual purchase and annual income does not preclude the existence of a relationship between annual purchases and annual income.
7.
Because a histogram of the data is symmetric and bell-shaped, we can apply the Empirical Rule. If the sample is representative of the population, then we can expect that about 68% of the data lie within one standard deviation of the mean, that is, between 170 cm and 184.4 cm, with 32% divided between the two tails of the distribution. This means that about 16% of young men aged 18-24 would be shorter than 170 cm. Almost all of the heights would be within three standard deviations of the mean, that is, between 155.6 cm and 198.8 cm. Therefore, there would not be many young men aged 18-24 who were taller than 199 cm.
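The Empirical Rule intervals in Exercise 7 can be sketched in a couple of lines (the mean of 177.2 cm and standard deviation of 7.2 cm are inferred from the intervals quoted above, not stated directly in the exercise):

```python
# Mean and standard deviation inferred from the intervals quoted above
mean, sd = 177.2, 7.2
one_sd = (mean - sd, mean + sd)            # about 68% of heights fall here
three_sd = (mean - 3 * sd, mean + 3 * sd)  # about 99.7% of heights fall here
print(one_sd)    # ≈ (170.0, 184.4)
print(three_sd)  # ≈ (155.6, 198.8)
```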
9.
The mean price of the inkjet printer cartridges is $26.93. The median price is $25.95.
11. This data set is reasonably symmetric and bell-shaped. The mean weekly sales for the sample of stores trying out the new marketing approach are $5101, with a standard deviation of $325.60. Applying the Empirical Rule, almost all of the sales would be between $4124.28 and $6077.85.
[Histogram: Weekly Sales for Stores with New Marketing Approach; x-axis: Weekly Sales; y-axis: Number of Stores, 0 to 5]
13. Since there are many tied values, it is a bit of a challenge to do the ranking process. The results are as follows.

Ratings by Customers   Rank   Ratings by Bosses   Rank
2                       4            3             7.5
2                       4            4             9.5
3                       8            2             4.5
4                      10            1             1.5
3                       8            2             4.5
3                       8            2             4.5
2                       4            4             9.5
2                       4            1             1.5
1                       1            3             7.5
2                       4            2             4.5
The Spearman rank correlation coefficient is -0.574. This indicates a negative correlation between the ratings by customers and the ratings by bosses; that is, ratings by customers tend to be higher when ratings by bosses are lower. However, the correlation coefficient indicates that this relationship is weak. 15. Because the data are quantitative and appear to be linearly related, the Pearson r is the appropriate measure of association. The Pearson r is 0.958, indicating a high correlation between the mark in Business Math and the mark in Statistics.
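The Spearman coefficient in Exercise 13 is just the Pearson correlation of the two rank columns; a minimal stdlib sketch, using the tied-average ranks worked out in Exercise 13, reproduces it:

```python
import math

# Average ranks (ties share the average of their positions)
cust_ranks = [4, 4, 8, 10, 8, 8, 4, 4, 1, 4]
boss_ranks = [7.5, 9.5, 4.5, 1.5, 4.5, 4.5, 9.5, 1.5, 7.5, 4.5]

n = len(cust_ranks)
mc = sum(cust_ranks) / n
mb = sum(boss_ranks) / n
num = sum((c - mc) * (b - mb) for c, b in zip(cust_ranks, boss_ranks))
den = math.sqrt(sum((c - mc) ** 2 for c in cust_ranks)
                * sum((b - mb) ** 2 for b in boss_ranks))
r_s = num / den   # Spearman rank correlation
print(round(r_s, 3))   # -0.574
```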
17. The customer incomes are skewed to the right, with a few incomes much higher than the rest in the data set. Therefore, the median and the interquartile range are the appropriate measures. The median income of the drugstore customers is $44,925. The interquartile range is 15,050 (Excel) or 15,762.5 (by hand). 19. First, we must examine the shape of the distribution. A histogram shows a reasonably symmetric data set (see below).
[Histogram: Contents of a Sample of Soup Cans; x-axis: Contents in Millilitres; y-axis: Number of Cans, 0 to 14]
The mean measurement is 540.4 mL, with a standard deviation of 5.17 mL. The maximum measurement in the sample is 551 mL, and this does not give any cause for concern that the cans contain more than 556 mL. As well, if we apply the Empirical Rule, we note that almost all of the measurements would be between 524.9 mL and 555.9 mL, which again does not give any cause for concern that the cans contain more than 556 mL. A measurement of 530 mL is about two standard deviations below the mean. The Empirical Rule says that about 95% of the data will lie within two standard deviations of the mean, with the remaining 5% split between the two tails of the distribution. If this can be applied to the population data, then about 2½ % of the cans would contain less than 530 mL.
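The z-score reasoning above can be checked directly (a sketch using the sample mean and standard deviation reported in the solution):

```python
mean, sd = 540.4, 5.17
z_530 = (530 - mean) / sd   # about two standard deviations below the mean
upper_3sd = mean + 3 * sd   # Empirical Rule: almost all measurements fall below this
print(round(z_530, 2), round(upper_3sd, 1))   # -2.01 555.9
```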
Chapter 4 Solutions Develop Your Skills 4.1 1a. Sample space: 246 employees commute more than 40 km by car 350-246=104 commute ≤ 40 km by car P(randomly selected employee commutes > 40 km by car) = 246/350 = 0.7029 1b.
Sample space: 150 employees arrange rides with others 350-150=200 ride alone P(randomly selected employee arranges rides with others) = 150/350 = 0.4286 Note that the sample space need not be more complicated than necessary. A full description of the sample space, for both worker characteristics (commuting distance and arranging rides) could look as follows.
Commuting Characteristics of Car Part Manufacturing Plant

                  Arrange Rides With Others   Ride Alone     Totals
Commute > 40 km   246-135=111                 135            246
Commute ≤ 40 km   104-65=39                   200-135=65     350-246=104
Totals            150                         350-150=200    350

3.
Sample space:
Professional Employees: 372
Managers: 37 (Note that "managerial" and "professional" are separate job classifications, so managers are not included in the count of professional workers.)
Clerical Employees: 520 − 372 − 37 = 111
P(randomly selected employee is professional) = 372/520 = 0.7154
P(randomly selected employee is clerical) = 111/520 = 0.2135
5.
Sample space: Loved Previous Math Courses: 56 Worked Very Hard In Previous Math Courses, But Did Not Enjoy Them, Or Thought Previous Math Courses Were Far Too Difficult, Or Equated Previous Math Courses With Sticking Needles In The Eyes: 225-56=169 P(randomly selected student did not love his/her previous math courses)=169/225=0.7511
Develop Your Skills 4.2
7. P(honey-nut flavour GIVEN family size box) = P(honey-nut & family size)/P(family size) = 180/315 = 0.5714
To check for independence, compare this with P(honey-nut flavour) = (315 + 180)/900 = 0.55
Since the two probabilities are NOT equal, the events are NOT independent, that is, the size of the box and the flavour are related, in terms of sales.
9. It is easiest to proceed if we first compute row and column totals for the table.

Accounts Receivable for a Roofing Company
age                ≤$5,000   $5,000 - <$10,000   ≥$10,000   total
< 30 days             12            15              10        37
30 - <60 days          7            11               2        20
60 days and over       3             4               1         8
total                 22            30              13        65
For the accounts receivable at this roofing company:
P(< 30 days) = 37/65 = 0.5692
P(< 30 days ⎪ ≤$5,000) = 12/22 = 0.5455
Since the two probabilities are not equal, account age and amount are not independent.
Develop Your Skills 4.3
11. We are told that the two employees live in different parts of the city, and so presumably could not be held up by the same traffic problems. Assume that each employee’s lateness is independent of the other’s lateness. Then P(both are late) = P(Jane is late and Oscar is late) = 0.02 • 0.04 = 0.0008. The probability is low that both Jane and Oscar will be late for work.
13. P(game aimed at 15-25 year olds succeeding) = 0.34
P(accounting program for small business succeeding) = 0.12
P(payroll system for government organizations succeeding) = 0.10
We are told to assume that the events are independent. P(all three succeed) = 0.34 • 0.12 • 0.10 = 0.00408. For the calculation of at least two out of three succeeding, we need to think about what this means in terms of the sample space. At least two out of three succeeding means exactly two of the three succeeding, or all three succeeding. A tree diagram might be helpful to picture this. We need to calculate and add the probabilities for the cases shown in bolded letters on the right-hand side of the tree diagram.
[Tree diagram: first branch, game (S: 0.34, F: 0.66); second branch, accounting program (S: 0.12, F: 0.88); third branch, payroll system (S: 0.10, F: 0.90). The eight outcomes are SSS, SSF, SFS, SFF, FSS, FSF, FFS, FFF; the bolded cases with at least two successes are SSS, SSF, SFS, and FSS.]
Once we have the cases identified, it is just a matter of arithmetic.
P(at least two out of three succeed)
= (0.34 • 0.12 • 0.10) + (0.34 • 0.12 • 0.90) + (0.34 • 0.88 • 0.10) + (0.66 • 0.12 • 0.10)
= 0.00408 + 0.03672 + 0.02992 + 0.00792 = 0.07864
15. P(hourly worker or only high school education)
= P(hourly worker) + P(only high school education) – P(hourly worker and only high school education)
= (790+265+2)/1345 + (790+7+1+0)/1345 – 790/1345
= (790 + 265 + 2 + 7 + 1)/1345 = 1065/1345 = 0.7918
Chapter Review Exercises
1. P(account paid early)
= 119/750 = 0.1587
P(account paid on time) = 320/750 = 0.4267
P(account paid late) = 200/750 = 0.2667
P(account uncollectible) = 111/750 = 0.1480
3. P(employee has Business diploma given she is a woman) = 25/40 = 0.625
To test to see if gender and possession of a Business diploma are related for GeorgeConn employees, we can compare the probability above to
P(employee has a Business diploma) = 55/100 = 0.55
P(employee has Business diploma given she is a woman) ≠ P(employee has a Business diploma), so gender and possession of a Business diploma are related for GeorgeConn employees. Female employees are more likely to have a Business diploma. 5.
Start by totalling the rows and columns of the table. This will speed up the probability calculations.
a. P(primary skill is bookkeeping) = 30/100 = 0.30
b. P(employee has less than one year of experience) = 50/100 = 0.50
c. P(primary skill is reception) = 25/100 = 0.25
d. P(employee has one to two years of experience) = 23/100 = 0.23
e. P(primary skill is document management) = 45/100 = 0.45
f. P(employee has more than two years of experience) = 27/100 = 0.27
7. P(salesperson will exceed targets two years in a row)
= P(salesperson exceeds target this year and exceeds target next year)
= P(exceeds target this year) • P(exceeds target next year ⎪ exceeds target this year)
= 0.78 • 0.15 = 0.117
9. A tree diagram is helpful:
[Tree diagram: P(pass on first attempt) = 0.75; P(fail first attempt) = 0.25; if the first attempt fails, P(pass second ⎪ fail first) = 0.90 and P(fail second ⎪ fail first) = 0.10. Joint probabilities: P(F and P) = 0.25 • 0.90 = 0.225; P(F and F) = 0.25 • 0.10 = 0.025.]
P(pass with no more than two attempts) = P(pass the first time) + P(fail the first time and pass the second time) = 0.75 + 0.225 = 0.975
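The tree-diagram arithmetic above can be sketched in a couple of lines (probabilities taken from the exercise):

```python
p_pass_first = 0.75
p_pass_second_given_fail = 0.90
# Pass with no more than two attempts: pass first, or fail first then pass second
p_within_two = p_pass_first + (1 - p_pass_first) * p_pass_second_given_fail
print(round(p_within_two, 3))   # 0.975
```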
11. R: market rises RC: market does not rise P: newsletter predicts rise PC: newsletter predicts market will not rise
[Tree diagram: P(R) = 0.60, P(RC) = 0.40; P(P ⎪ R) = 0.70, P(PC ⎪ R) = 0.30; P(P ⎪ RC) = 0.30, P(PC ⎪ RC) = 0.70. Joint probabilities: P(R and P) = 0.60 • 0.70 = 0.42; P(R and PC) = 0.60 • 0.30 = 0.18; P(RC and P) = 0.40 • 0.30 = 0.12; P(RC and PC) = 0.40 • 0.70 = 0.28.]
P(correct prediction) = P(market rises and newsletter predicts rise) + P(market does not rise and newsletter predicts market will not rise) = 0.42 + 0.28 (from tree diagram) = 0.70 13. If we can identify one situation where gender and tendency to use the health facilities are related, we can say that gender and the tendency to use health facilities are related. Compare P(used the facilities) with P(used the facilities ⎪ male) P(used the facilities) = 210/350 = 0.6 P(used the facilities ⎪ male) = 65/170 = 0.3824 Since these two probabilities are not equal, gender and tendency to use the health and fitness facilities are not independent (that is, they are related).
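The independence check in Exercise 13 compares a marginal probability with a conditional one; a minimal sketch with the counts from the exercise:

```python
# 210 of 350 members used the facilities; 65 of the 170 male members used them
p_used = 210 / 350               # marginal probability
p_used_given_male = 65 / 170     # conditional probability, given male
# If these were equal, gender and facility use would be independent
print(round(p_used, 4), round(p_used_given_male, 4))   # 0.6 0.3824
```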
15.
One of the ways to test for independence (or lack of it) is as follows.
P(purchased the product) = 228/300 = 0.76
P(purchased the product given saw the TV ad) = 152/(152 + 36) = 152/188 = 0.8085
These two probabilities are not equal, so purchasing behaviour is related to seeing the TV ad. Those who saw the ad were more likely to purchase the product. 17. I: Canadian adult has taken instruction in canoeing IC: Canadian adult has not taken instruction in canoeing CT: Canadian adult is going on a canoe trip this summer CTC: Canadian adult is not going on a canoe trip this summer
[Tree diagram: P(I) = 0.03, P(IC) = 0.97; P(CT ⎪ I) = 0.46, P(CTC ⎪ I) = 0.54; P(CT ⎪ IC) = 0.20, P(CTC ⎪ IC) = 0.80. Joint probabilities: P(I and CT) = 0.03 • 0.46 = 0.0138; P(I and CTC) = 0.03 • 0.54 = 0.0162; P(IC and CT) = 0.97 • 0.20 = 0.194; P(IC and CTC) = 0.97 • 0.80 = 0.776.]
19. P(a randomly-selected customer from one of these stores uses a cash/debit card or a credit card for payment) = (150 + 180)/500 = 0.66
21. For people who visit the facility:
P(buy a membership) = 0.40
P(buy a membership and sign up for fitness classes) = 0.30
P(fitness classes ⎪ bought a membership) = 0.30/0.40 = 0.75
23. If we can identify one situation where gender and type of alcoholic drink are related, we can say that they are related in general. However, there is no case where gender and type of alcoholic drink are related. P(wine) = (36 + 54)/(42 + 63 + 36 + 54 + 22 + 33)=90/250=0.36 P(wine ⎪ female) = 54/(63 + 54 + 33) = 0.36 P(wine ⎪ male) = 36/(42 + 36 + 22) = 0.36 So we can see that P(wine) = P(wine ⎪ female) = P(wine ⎪ male). Similarly, P(beer) = P(beer ⎪ female) = P(beer ⎪ male). As well, P(other alcoholic drinks) = P(other alcoholic drinks ⎪ female) = P(other alcoholic drinks ⎪ male). We cannot identify a situation where gender and type of alcoholic drink are not independent, so we conclude that gender and type of alcoholic drink ordered are independent in this sample. 25. In this case, we are stuck. We have only one probability (25%), but we do not have independent events. Once the first randomly-selected Canadian is asked about RRSP plans, he/she is removed from further consideration. Depending whether this person plans to make an RRSP contribution over the next year, this will affect the 25% probability of making a contribution. However, it will not affect it very much, because there are many millions of Canadians. So, although the events are not really independent, we can still use the probability as if they were. P(all four intend to contribute to their RRSPs over the next year) = 0.25 • 0.25 • 0.25 • 0.25 = 0.0039
Chapter 5 Solutions
Develop Your Skills 5.1
1. a. Discrete. The number of passengers on a flight from Toronto to Paris is a count.
b. Continuous. The time it takes you to drive to work in the morning could take on any one of an infinite number of possible values, within some range (shortest possible trip to longest possible trip).
c. Discrete. The number of cars that arrive at the local car dealership for an express oil change service on Wednesday is a count.
d. Continuous. The time it takes to cut a customer’s lawn could take on any one of an infinite number of possible values, within some range (smallest easiest lawn to largest most difficult lawn).
e. Discrete. The number of soft drinks a student buys during one week is a count.
f. Continuous. The kilometres driven on one tank of gas could take on any one of an infinite number of possible values, within some range (shortest possible distance to longest possible distance).
3. A tree diagram shows the eight outcomes (game: S = 0.34, F = 0.66; accounting program: S = 0.12, F = 0.88; payroll system: S = 0.10, F = 0.90):
P(SSS) = 0.34 • 0.12 • 0.10 = 0.00408
P(SSF) = 0.34 • 0.12 • 0.90 = 0.03672
P(SFS) = 0.34 • 0.88 • 0.10 = 0.02992
P(SFF) = 0.34 • 0.88 • 0.90 = 0.26928
P(FSS) = 0.66 • 0.12 • 0.10 = 0.00792
P(FSF) = 0.66 • 0.12 • 0.90 = 0.07128
P(FFS) = 0.66 • 0.88 • 0.10 = 0.05808
P(FFF) = 0.66 • 0.88 • 0.90 = 0.52272
P(x=0) = 0.52272
P(x=3) = 0.00408
P(x=1) = 0.26928 + 0.07128 + 0.05808 = 0.39864
P(x=2) = 1 − 0.52272 − 0.00408 − 0.39864 = 0.07456

x      0         1         2         3
P(x)   0.52272   0.39864   0.07456   0.00408
The expected number of successes is µ = 0(0.52272) + 1(0.39864) + 2(0.07456) + 3(0.00408) = 0.56
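The distribution and expected value above can be rebuilt by enumerating the eight outcomes of the tree (success probabilities 0.34, 0.12, and 0.10 as given; a sketch):

```python
from itertools import product

p_success = [0.34, 0.12, 0.10]   # game, accounting program, payroll system
pmf = {k: 0.0 for k in range(4)}
for outcome in product([1, 0], repeat=3):   # 1 = success, 0 = failure
    prob = 1.0
    for hit, p in zip(outcome, p_success):
        prob *= p if hit else 1 - p
    pmf[sum(outcome)] += prob

mu = sum(x * px for x, px in pmf.items())   # expected number of successes
print({x: round(px, 5) for x, px in pmf.items()}, round(mu, 2))
# {0: 0.52272, 1: 0.39864, 2: 0.07456, 3: 0.00408} 0.56
```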
5. x = number of customers who order the daily special at a restaurant, out of the next 6 customers

x      0      1      2      3      4      5      6
P(x)   0.03   0.05   0.28   0.45   0.12   0.04   0.03

P(x=6) = 1 – 0.03 – 0.05 – 0.28 – 0.45 – 0.12 – 0.04 = 0.03
µ = Σx•P(x) = 0•0.03 + 1•0.05 + 2•0.28 + 3•0.45 + 4•0.12 + 5•0.04 + 6•0.03 = 2.82
σ = √[Σx²P(x) − µ²] = √[(0²•0.03 + 1²•0.05 + 2²•0.28 + 3²•0.45 + 4²•0.12 + 5²•0.04 + 6²•0.03) − 2.82²] = √(9.22 − 7.9524) = √1.2676 = 1.1259
Develop Your Skills 5.2
7. P(pass) = P(x ≥ 13, n = 25, p = 1/5) = 1 − P(x ≤ 12) = 1 − 1 = 0 (using the tables). With Excel, we get the slightly more accurate result of 0.000369048. Either way, it would be basically impossible to pass this test by guessing. Does this result change any ideas you might have had that multiple “guess” tests are easy?
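The Excel value 0.000369048 for Exercise 7 can be reproduced from the binomial formula directly (a stdlib sketch):

```python
from math import comb

n, p = 25, 0.2   # 25 questions, 1-in-5 chance of guessing each correctly
# P(x >= 13): sum the binomial probabilities for 13 through 25 correct guesses
p_pass = sum(comb(n, k) * p ** k * (1 - p) ** (n - k) for k in range(13, n + 1))
print(round(p_pass, 9))   # ≈ 0.000369048
```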
In this case, it is likely that the respondents to the poll on losing weight would not be a random sample, but rather a subset of the population of visitors to the site. Therefore, we should not apply the probability from this sample to all visitors to the site.
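The binomial tail in Exercise 7 can also be computed exactly without tables or Excel. A short sketch using only Python's standard library reproduces the quoted value of about 0.000369:

```python
from math import comb

n, p = 25, 1 / 5   # 25 questions, guessing among 5 choices
# P(pass) = P(x >= 13) = 1 - P(x <= 12)
p_pass = 1 - sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(13))
```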
Develop Your Skills 5.3

11. µ = 5,000, σ = 367
P(x ≥ 6000) = P(z ≥ (6000 − 5000)/367) = P(z ≥ 2.72) = 1 − 0.9967 = 0.0033
Only 0.33% of the bulbs will last more than 6000 hours. With Excel, we find P(x ≥ 6000) = 0.0032.
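As a cross-check on the table and Excel values, Python's standard-library NormalDist gives the same tail probability:

```python
from statistics import NormalDist

bulbs = NormalDist(mu=5000, sigma=367)   # bulb lifetimes, in hours
p_over_6000 = 1 - bulbs.cdf(6000)        # P(x >= 6000)
```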
13. µ = $53, σ = $9
a. Need to calculate P(x < $38).
P(x < 38) = P(z < (38 − 53)/9) = P(z < −1.67) = 0.0475
The probability that a randomly selected warranty expense for these bikes would be less than $38 is 0.0475. With Excel, we find P(x < 38) = 0.0478.
b. Need to calculate P(38 ≤ x ≤ 62).
P(38 ≤ x ≤ 62) = P((38 − 53)/9 ≤ z ≤ (62 − 53)/9) = P(−1.67 ≤ z ≤ 1) = 0.8413 − 0.0475 = 0.7938
The probability that a randomly selected warranty expense for these bikes would be between $38 and $62 is 0.7938. With Excel, we find P(38 ≤ x ≤ 62) = 0.7936.
c. P(x > 68) = P(z > (68 − 53)/9) = P(z > 1.67) = 1 − 0.9525 = 0.0475
The probability that a randomly selected warranty expense for these bikes would be above $68 is 0.0475. With Excel, we find P(x > 68) = 0.0478.
15.
µ = 32 seconds, σ = 10 seconds We want to find an x-value with 10% probability below it. Search the body of the normal table for a value as close as possible to 0.10. The closest value is 0.1003 and the associated z-score is -1.28. x=µ+z•σ x = 32 – 1.28•10 = 19.2 Workers must be able to finish the task in 19.2 seconds to escape the weekend training. With Excel, we calculate 19.18448, or 19.2 seconds (to one decimal place).
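The Excel calculation mentioned above can be reproduced with NormalDist.inv_cdf, the standard-library equivalent of NORMINV. This sketch finds the same 10th-percentile cutoff:

```python
from statistics import NormalDist

times = NormalDist(mu=32, sigma=10)   # task-completion times, in seconds
cutoff = times.inv_cdf(0.10)          # x-value with 10% of the area below it
```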
Chapter Review Exercises

Solutions provided are based on the tables and by-hand calculations. Answers based on Excel will be more accurate, and may differ slightly from those arrived at with manual calculations.

1a. The number of magazines subscribed to by a Canadian household is neither a normal nor a binomial random variable. It is not binomial, because there are more than two possible outcomes. The number of magazine subscriptions could range from 0 to some highest possible number. The random variable is discrete, as the possible values are 0, 1, 2, 3, … n. As well, it is unlikely that the probability distribution could be approximated by the normal distribution, as it is likely to be right-skewed. That is, a few households are likely to subscribe to a higher number of magazines, while most households probably subscribe to only a few. Remember, not all random variables are either normal or binomial!

1b. In this case, the random variable would be binomial. There are only two possible outcomes: either the household subscribes to one or more magazines (success) or it does not subscribe to any magazines (failure). The poll would report the number of successes in 1,235 trials. The trials are not strictly independent, as sampling would be done without replacement. However, 1,235 households represent only a small portion of all Canadian households (definitely less than 5%), so the distribution will still be approximately binomial.

1c. The annual expenditure by Canadian households on magazine subscriptions is actually a discrete random variable (all possible values are in dollars and cents, and a value like $123.47869 is not possible). However, as discussed in the text, dollar amounts are often approximated by the normal distribution. As noted above, though, this distribution may not be normal.
It is likely that most households subscribe to only a few magazines, leading to lower expenditures, while a few households might subscribe to many (or more expensive) magazines. Without some actual data, it is difficult to know the shape of the distribution. 3.
[Tree diagram: three students (1st, 2nd, 3rd), each reading the financial pages with probability P(R) = 0.20 and not reading them with probability P(RC) = 0.80; the eight branches run from RRR through RCRCRC.]
P(x=3) = 0.2•0.2•0.2 = 0.008
P(x=2) = 3•(0.2•0.2•0.8) = 0.096
P(x=1) = 3•(0.2•0.8•0.8) = 0.384
P(x=0) = 0.8•0.8•0.8 = 0.512

Probability Distribution for Number of Business Students Who Read the Financial Pages of the Daily Newspaper (out of 3)
x      0      1      2      3
P(x)   0.512  0.384  0.096  0.008

5.
[Tree diagram: two trials (1st, 2nd), with P(S) = 0.40 and P(F) = 0.60 on each, giving branches SS, SF, FS, FF.]
Probability Distribution for Binomial Random Variable, n=2, p=0.4
x      0      1      2
P(x)   0.36   0.48   0.16

Expected value = np = 2 • 0.4 = 0.8
Mean = 0 • 0.36 + 1 • 0.48 + 2 • 0.16 = 0.8
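Instead of drawing a tree, the same distribution falls out of the binomial formula. A minimal sketch (standard-library Python, offered as a cross-check rather than part of the original solution):

```python
from math import comb

def binom_pmf(k: int, n: int, p: float) -> float:
    """P(x = k) for a binomial random variable with n trials."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

# n = 2 trials, p = 0.4, as in the tree above
dist = [binom_pmf(k, 2, 0.4) for k in range(3)]
mean = sum(k * pk for k, pk in enumerate(dist))   # should equal np = 0.8
```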
7. µ = 65, σ = 12
a. P(x ≥ 50) = P(z ≥ (50 − 65)/12) = P(z ≥ −1.25) = 1 − 0.1056 = 0.8944
89.44% of the class passed. (Excel answer is the same.)
b. P(x ≤ 45) = P(z ≤ (45 − 65)/12) = P(z ≤ −1.67) = 0.0475
4.75% of the class received a mark of 45% or lower. (Excel answer is 0.0478.)
c. P(50 ≤ x ≤ 75) = P((50 − 65)/12 ≤ z ≤ (75 − 65)/12) = P(−1.25 ≤ z ≤ 0.83) = 0.7967 − 0.1056 = 0.6911
69.11% of the class received a mark between 50% and 75%. (Excel answer is 0.6920.)
d. P(x ≥ 90) = P(z ≥ (90 − 65)/12) = P(z ≥ 2.08) = 1 − 0.9812 = 0.0188
1.88% of the class received a mark of 90% or higher. (Excel answer is 0.0186.)
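The four Excel answers quoted in parts a through d can be reproduced with Python's statistics.NormalDist, shown here as a hedged cross-check:

```python
from statistics import NormalDist

marks = NormalDist(mu=65, sigma=12)   # class marks, in percent

p_a = 1 - marks.cdf(50)               # a. P(x >= 50)
p_b = marks.cdf(45)                   # b. P(x <= 45)
p_c = marks.cdf(75) - marks.cdf(50)   # c. P(50 <= x <= 75)
p_d = 1 - marks.cdf(90)               # d. P(x >= 90)
```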
9. a. This is a normal probability problem. µ = 840, σ = 224
P(x ≥ 1000) = P(z ≥ (1000 − 840)/224) = P(z ≥ 0.71) = 1 − 0.7611 = 0.2389
The probability that the printer produces more than 1,000 pages before this cartridge needs to be replaced is 0.2389.
b. P(x < 600) = P(z < (600 − 840)/224) = P(z < −1.07) = 0.1423
The probability that the printer produces fewer than 600 pages before this cartridge needs to be replaced is 0.1423.
c. We need to calculate an x-value such that P(x ≥ this x-value) = 0.95. This means there is 0.05 to the left of this x-value. Search the body of the normal table for a value as close as possible to 0.05 (and there is a "tie": one entry is 0.0495 and one is 0.0505, and both are equally close to 0.05). Rather than approximate, go to Excel. NORMINV tells us that the x-value is 471.6 (the correct z-score is actually −1.64485). 95% of the time, the cartridges will produce at least 471.6 pages.
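The NORMINV step above has a standard-library equivalent in Python; a sketch:

```python
from statistics import NormalDist

pages = NormalDist(mu=840, sigma=224)   # pages per cartridge
# 95% of cartridges produce at least this many pages (5% of the area lies below)
min_pages = pages.inv_cdf(0.05)
```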
11. In this case, sampling is done without replacement. Although investment banking is a highly specialized field, “all” investment bankers would be quite a large number. The sample size of 15 is probably much less than 5% of the population, so the binomial distribution can still be used to approximate the probabilities. a.
n=15, p = 0.25 P(x = 15) = P(x ≤ 15) – P(x ≤ 14) = 1 – 1 = 0 (from the table) The probability that all 15 have profited from insider information is 0. If we use Excel, we see that there is a very small probability associated with this outcome (0.00000000093).
b.
n=15, p = 0.25 P(x ≥ 6) = 1 – P(x ≤ 5) = 1 – 0.852 = 0.148 (from the table) The probability that at least 6 have profited from insider information is 0.148.
c. n = 3, p = 0.25
P(x = 1) = (3 choose 1)(0.25)^1(0.75)^2 = (3!/(1!•2!))(0.25)(0.5625) = 3(0.25)(0.5625) = 0.4219
The probability that one of the three investment bankers profited from insider information is 0.4219.

13. Normal distribution with µ = $49,879 and σ = $7,088
a. P(45000 ≤ x ≤ 50000) = P((45000 − 49879)/7088 ≤ z ≤ (50000 − 49879)/7088) = P(−0.69 ≤ z ≤ 0.02) = 0.5080 − 0.2451 = 0.2629
The probability of a new graduate receiving a salary between $45,000 and $50,000 is 0.2629.
b. P(x > 55000) = P(z > (55000 − 49879)/7088) = P(z > 0.72) = 1 − 0.7642 = 0.2358
The probability of a new graduate getting a starting salary of more than $55,000 is 0.2358.
c. Need to locate a salary such that the area to the left is 90%. Search the body of the normal table for an entry as close as possible to 0.90. The closest entry in the table is 0.8997, which has an associated z-score of 1.28.
x = µ + z•σ = 49,879 + 1.28 • 7,088 = $58,951.64
If you wanted to be earning more than 90% of new college graduates in computer information systems, you would have to earn $58,952.
15. Normal distribution, µ = 10,000 and σ = 2,525
a. With tables:
P(x > 12000) = P(z > (12000 − 10000)/2525) = P(z > 0.79) = 1 − 0.7852 = 0.2148
The proportion of the flood lamps that would last for more than 12,000 hours is 0.2148.
With Excel: P(x > 12000) = 1 − P(x ≤ 12000) = 1 − 0.785843 = 0.214157.
b. With tables: Need to find an x-value such that the probability to the left is 2%. Search the body of the table for an entry as close as possible to 0.02. There is an entry of 0.0202, with an associated z-score of −2.05.
x = µ + z•σ = 10,000 − 2.05 • 2,525 = 4,823.75
The manufacturer would advertise a lifetime of 4,823 hours, and only 2% of the lamps will burn out before the advertised lifetime. Since the guaranteed hours are not a nice round number, the manufacturer may choose to use a value of 4,820 or even 4,800 hours instead.
With Excel: use NORMINV to get 4814.284.

17. Normal distribution with µ = $2,400 and σ = $756
a. With tables:
P(x < 1000) = P(z < (1000 − 2400)/756) = P(z < −1.85) = 0.0322
The proportion of the bills that are less than $1,000 is 3.22%.
With Excel: P(x < 1000) = 0.032024.
b. With tables:
P(x > 1500) = P(z > (1500 − 2400)/756) = P(z > −1.19) = 1 − 0.1170 = 0.8830
The proportion of the bills that are more than $1,500 is 88.3%.
With Excel: P(x > 1500) = 1 − P(x ≤ 1500) = 1 − 0.11693 = 0.88307.
c.
With tables: Need to find an x-value such that the area to the right of it is 0.75. We have to work with left-sided probabilities, so we note that this means there is 0.25 to the left of the x-value. We search the body of the normal table for the value closest to 0.25; it is 0.2512. The associated z-score is -0.67. x=µ+z•σ x = $2,400 – 0.67 • $756 = $1,893.48 75% of the bills are more than $1,893.48. With Excel: Use NORMINV to find $1,890.09.
19. In this case, sampling is done without replacement. We have to come to some conclusion about the total number of customers at the bicycle store. The sample size of 25 has to be no more than 5% of the population for the binomial distribution to be used to approximate the probabilities. If the bicycle store has at least 20•25 = 500 customers, the binomial distribution can still be used. We will proceed with that assumption. P(x = 1, n = 25, p = 0.032) = 0.3665 (from Excel).
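The Excel value P(x = 1, n = 25, p = 0.032) = 0.3665 can be confirmed directly from the binomial formula:

```python
from math import comb

n, p = 25, 0.032
# Exactly one of the 25 sampled customers is a "success"
p_one = comb(n, 1) * p**1 * (1 - p)**(n - 1)
```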
Chapter 6 Solutions

Develop Your Skills 6.1

1. Claim about the population: graduates of the Business programs earn an average salary of $40,000 a year.
Sample result: sample mean is $39,368 for a random sample of 35 graduates.
If the college's claim is true, sample means of one-year-after-graduation salaries would be normally distributed, with a mean of $40,000 and a standard deviation of $554. The sample mean is lower than expected. We need to calculate P(x̄ ≤ $39,368).
P(x̄ ≤ 39368) = P(z ≤ (39368 − 40000)/554) = P(z ≤ −1.14) = 0.1271
It would not be highly unlikely to get a sample mean salary as low as $39,368 if the college's claim about salaries is true. Since this is not an unexpected result, we do not have reason to doubt the college's claim.

3. Desired characteristic of the population: no more than 25% of employees would enrol in education programs.
Sample result: sample proportion is 26%, for 500 randomly selected employees.
If the population percentage is actually 25%, sample proportions would be normally distributed, with a mean of 0.25 and a standard deviation of 0.021651. The sample proportion is higher than expected. We need to calculate P(p̂ ≥ 0.26).
P(p̂ ≥ 0.26) = P(z ≥ (0.26 − 0.25)/0.021651) = P(z ≥ 0.46) = 1 − 0.6772 = 0.3228
The probability of getting a sample proportion as high as 26%, when the population proportion is actually 25%, is over 32%. It would not be unusual to get such a sample proportion if the actual population proportion is only 25%. The sample does not provide enough evidence to conclude that the actual percentage of employees who would enrol in such programs is more than 25%. On the basis of the sample results, the company should conclude that it can afford the programs and extend the benefit.
5. Claim about the population: average commuting time is 32 minutes. Sample result: a random sample of 20 commuters has an average commuting time of 40 minutes. If the true average commuting time is 32 minutes, the sample means would be normally distributed, with a mean of 32 minutes, and a standard deviation of 5 minutes. The sample mean is higher than expected. We need to calculate P( x ≥40).
P(x̄ ≥ 40) = P(z ≥ (40 − 32)/5) = P(z ≥ 1.6) = 1 − 0.9452 = 0.0548
The probability of getting a sample average commuting time as high as 40 minutes, if the actual population average commuting time is 32 minutes, is 0.0548. This result gives us pause. Such a sample result is somewhat unusual: it will happen with a probability of only 5.48%. However, the cut-off we have decided to use for deciding what is "unusual" is a probability of 5% or less. This sample result does not meet that test. So, in this case, the sample does not give us enough evidence to conclude that the average commuting time has increased from 32 minutes. You may not be entirely comfortable with this decision. We will discuss this point further when we talk about p-values (in Chapter 7). For now, stick to the rule, and later you will find out more about how these decisions are made.

Develop Your Skills 6.2

7. Claim about the population: average salary of business program graduates one year after graduation is at least $40,000 a year.
Sample result: a random sample of 20 salaries of business program graduates one year after graduation has an average of $38,000. We are told the salaries are normally distributed, with σ = $3,300.
If the true average salary is $40,000, the sample means would be normally distributed, with a mean of $40,000 and a standard deviation of 3300/√20.
P(x̄ ≤ 38000) = P(z ≤ (38000 − 40000)/(3300/√20)) = P(z ≤ −2.71) = 0.0034
If the true average salary were $40,000, it would be almost impossible to get a sample mean as low as $38,000 under these conditions. Such a sample mean provides evidence that the average salary of graduates of the business program one year after graduation is less than $40,000.
9. Claim about the population: it takes 1.5 working days on average to approve loan requests.
Sample result: a random sample of 64 loan requests has an average of 1.8 working days. We are told the population data are normally distributed, with σ = 2.0.
If the true average time for a loan to be approved is 1.5 working days, the sample means would be normally distributed, with a mean of 1.5 and a standard deviation of 2/√64.
P(x̄ ≥ 1.8) = P(z ≥ (1.8 − 1.5)/(2/√64)) = P(z ≥ 1.2) = 1 − 0.8849 = 0.1151
If the claim about the average time to approve the loan requests is true, the probability of getting this sample result would be 0.1151. This is not unusual, and so there is not enough evidence to conclude that the bank understates the average amount of time to approve loan requests.

Develop Your Skills 6.3

11. This is not really a random sample. It excludes anyone who eats the cereal but does not visit the website set up for the survey. As well, the sample is likely to be biased. People who did not find a free ticket in their cereal box are probably more likely to answer your survey. This sample data set cannot be reliably used to decide about the proportion of cereal boxes with a free ticket.

13. Claim about the population: p = 0.01 (proportion of defective tires is 1%)
Sample result: a random sample of 500 tires reveals 8/500 = 1.6% that are defective.
Sampling is done without replacement. Presumably the company produces hundreds of thousands of tires, so we can be fairly confident that 500 tires is not more than 5% of the total population. The binomial probability distribution is still an appropriate underlying model.
Check conditions: np = 500(0.01) = 5; nq = 500(0.99) = 495. Since np < 10, the sampling distribution of p̂ should not be used.
Using Excel and the binomial distribution, P(x ≥ 8, n = 500, p = 0.01) = 0.132319866.
A sample proportion like the one we got would not be unusual if in fact 1% of the tires are defective. There is not enough evidence to suggest that the rate of defective tires is more than 1%. Depending how the national survey was done, it might have had more response from those with defective tires.
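Because np < 10, the solution above falls back on the exact binomial probability. The quoted Excel value can be reproduced with an exact standard-library computation:

```python
from math import comb

n, p = 500, 0.01
# P(x >= 8) = 1 - P(x <= 7) for the number of defective tires in the sample
p_at_least_8 = 1 - sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(8))
```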
15. Claim about the population: p = 0.40 (percentage of retired people who eat out at least once a week)
Sample result: a random sample of 150 retired people in your city reveals 44 who eat out at least once a week.
Sampling is done without replacement. If the city is fairly large, we can presume that 150 retired people are not more than 5% of the total population of retired people. The binomial probability distribution is still an appropriate underlying model.
Check conditions: np = 150(0.40) = 60; nq = 150(0.60) = 90. Both are ≥ 10.
Since n is fairly large, at 150, the sampling distribution of p̂ can be used. It will be approximately normal, with mean = p = 0.40 and standard error = √(pq/n) = √((0.40)(0.60)/150) = 0.04.
p̂ = 44/150 = 0.29333333
P(p̂ ≤ 0.293333) = P(z ≤ (0.293333 − 0.40)/√((0.40)(0.60)/150)) = P(z ≤ −2.67) = 0.0038
It would be very unusual to get a sample result such as this one if in fact 40% of retired people ate out at least once a week. The sample results suggest that fewer than 40% of retired people eat out at least once a week in your city. However, before deciding whether or not to focus on retired people, it would be important to know if this group tends to eat out more or less than other groups of people. While the percentage of those who eat out more than once a week is apparently lower in your city than in the survey, it still might be higher than for other groups, and might still be a good target market.
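The z-calculation for the sampling distribution of p̂ can be cross-checked with NormalDist. This sketch assumes the same claim, p = 0.40 with n = 150:

```python
from math import sqrt
from statistics import NormalDist

p, n = 0.40, 150
se = sqrt(p * (1 - p) / n)              # standard error of p-hat, 0.04
p_hat = 44 / n                          # observed sample proportion
p_value = NormalDist(p, se).cdf(p_hat)  # P(p-hat <= 0.2933)
```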
Chapter Review Exercises

1a. Your sketch should look something like the diagram below.
[Sketch: a normal curve centred at 167.5 cm, with the horizontal axis marked from 119.5 to 215.5 cm in steps of 12 cm.]

1b. The sampling distribution of the sample means (samples of size 25) will be normally distributed, because the heights in the population are normally distributed. The mean of the sample means will be 167.5 cm, and the standard error will be 12/√25 = 2.4 cm. The sampling distribution of the sample means will therefore be much narrower than the population distribution of heights. It will have to be much taller as well, since the total area under the distribution must be 1 (it is a probability distribution).
[Sketch: the sampling distribution for n = 25 drawn on the same axis (119.5 to 215.5 cm), narrower and taller than the population distribution.]
c. The sampling distribution of the sample means (samples of size 40) will be normally distributed, because the heights in the population are normally distributed. The mean of the sample means will be 167.5 cm, and the standard error will be 12/√40 = 1.8974 cm. It will be narrower still.
[Sketch: the still-narrower sampling distribution for n = 40 on the same axis, 119.5 to 215.5 cm.]

d. All three distributions are normally distributed, and all have a mean of 167.5 cm. There is greatest variability in the population distribution of heights. The sampling distribution of the means of 25 heights is much less variable, and the sampling distribution of the means of 40 heights is the least variable.

3. p = 0.86 (claimed proportion of those taking the pill who get relief within 1 hour)
p̂ = 287/350 = 0.82
n = 350 (fairly large)
Sampling is done without replacement. Presumably, there are thousands and thousands of back pain sufferers who take this medication, so it is still appropriate to use the binomial distribution as the underlying model.
Check for normality: np = 350(0.86) = 301 > 10; nq = 350(1 − 0.86) = 49 > 10.
The binomial distribution could be approximated by a normal distribution, and so we can use the sampling distribution of p̂.
P(p̂ ≤ 0.82) = P(z ≤ (0.82 − 0.86)/√((0.86)(0.14)/350)) = P(z ≤ −2.16) = 0.0154
The probability of getting a sample result as extreme as the one we got, if the true proportion of sufferers who get relief within one hour is 86%, is 0.0154, which is less than 5%. The sample result qualifies as an unexpected or unusual event. Since we got this sample result, we have enough evidence to suggest that fewer than 86% of back pain sufferers who take this pill get relief within one hour.

5. p = 0.17 (% of Americans whose primary breakfast beverage is milk)
p̂ = 102/500 = 0.204
n = 500 (fairly large)
Sampling is done without replacement. The sample size is 500. There are millions of Canadians who eat breakfast, and so the sample is certainly not more than 5% of the population. It is appropriate to use the binomial distribution as the underlying model.
Check for normality: np = 500(0.17) = 85 > 10; nq = 500(1 − 0.17) = 415 > 10.
The binomial distribution could be approximated by a normal distribution, so we can use the sampling distribution of p̂.
P(p̂ ≥ 0.204) = P(z ≥ (0.204 − 0.17)/0.016799) = P(z ≥ 2.02) = 1 − 0.9783 = 0.0217
There would not be much of a chance of getting a sample proportion as high as 20.4% if the true proportion of Canadians who drink milk as the primary breakfast beverage were actually 17%. The sample evidence gives us reason to think that the proportion of Canadians who choose milk as their primary breakfast beverage may be higher than the percentage of Americans who do so.
7. x̄ = $756, σ = $132, µ = $700 (the average cost of textbooks per semester for a college student), n = 75
We are told that the population data are normally distributed, so the sampling distribution will also be normal, with a mean of $700 and a standard error of σx̄ = 132/√75.
P(x̄ ≥ 756) = P(z ≥ (756 − 700)/(132/√75)) = P(z ≥ 3.67) = 1 − 0.9999 = 0.0001
There is almost no chance of getting a sample mean as high as $756 if the population average textbook cost is $700. The sample provides evidence that the average cost of textbooks per semester for college students has increased.

9. Start by analyzing the sample data set.
x̄ = 6188.3875 hours
s = 234.6176
µ = 6200 hours (claimed average lifespan of electronic component)
n = 40
[Histogram: "Lifespans of a Random Sample of 40 Electronic Components", with lifespan in hours on the horizontal axis and number of components (0 to 18) on the vertical axis.]
The histogram appears to be fairly normal, with some skewness to the right. The sample size is fairly large, at 40, so it seems likely the sampling distribution would be normally distributed. The sampling distribution would have a mean of 6200 and a standard error of σx̄ = 234.6176/√40.
We are told to assume σ = s = 234.6176.
P(x̄ ≤ 6188.3875) = P(z ≤ (6188.3875 − 6200)/(234.6176/√40)) = P(z ≤ −0.31) = 0.3783
The probability of getting a sample result as small as 6188.3875 hours, if the true average lifespan of the components is 6200, is 0.3783. This sample result is not unusually small. The sample evidence does not give us reason to doubt the producer’s claim that the average lifespan of the electronic components is 6200 hours.
11. Start by analyzing the data set.
x̄ = 1756.48
s = 599.773
µ = $2000 (claimed average daily sales)
n = 29
The histogram is skewed to the right. The sample size, at 29, is reasonably large. We will proceed by assuming that with this sample size, the sampling distribution will be approximately normal.
P(x̄ ≤ 1756.48) = P(z ≤ (1756.48 − 2000)/(599.773/√29)) = P(z ≤ −2.19) = 0.0144
The probability of getting average daily sales as low as $1,756.48, if the true average daily sales are $2,000, is quite small, at 1.44%. The fact that we got this unusual sample result gives us reason to doubt the former owner's claim that average daily sales at the shop were $2,000. However, it may be that the sales have changed under the new owner, for a variety of reasons. The data cannot allow us to conclude that the former owner misrepresented the daily sales figures.
13. p = 0.20
n = 300
Number of successes = 77 (where "success" is defined as a student with a laptop), so p̂ = 77/300 = 0.2567.
Sampling is done without replacement. The sample size is 300, and we have no information about the total number of students at the college. We can proceed, first by noting that we are assuming there are at least 300•20 = 6000 students at the college. We will use the binomial distribution as the underlying model.
Check for normality: np = 300(0.20) = 60 > 10; nq = 300(1 − 0.20) = 240 > 10.
The binomial distribution could be approximated by a normal distribution, so we can use the sampling distribution of p̂.
P(p̂ ≥ 0.2567) = P(z ≥ (0.2567 − 0.20)/0.023094) = P(z ≥ 2.45) = 1 − 0.9929 = 0.0071
It would be very unlikely to find 77 laptops among 300 students if the percentage of students with laptops were still 20%. Since we found this result, we have evidence to suggest that the percentage of students with laptop computers is now more than 20%.
Chapter 7 Solutions

Develop Your Skills 7.1

1. a. H0: p = 0.05
H1: p < 0.05
b. A Type I error arises when we mistakenly reject the null hypothesis when it is in fact true. This would correspond to accepting a shipment of keyboards when 5% or more of them were defective. A Type II error arises when we mistakenly fail to reject the null hypothesis when it is in fact false. This would correspond to refusing to accept a shipment of keyboards that actually had fewer than 5% defective.
c. From the manufacturer's point of view, Type I error is probably more important, because it would lead to using more faulty keyboards than desired. This would likely lead to customer complaints, and might hurt the company's quality reputation.
d. From the supplier's point of view, the Type II error would be more frustrating, because a shipment that should have been accepted was returned.

3.
a. H0: µ = 142 ml
H1: µ ≠ 142 ml
This should be a two-tailed test. Both underfilled and overfilled cans of peaches present problems.
b. A Type I error arises when we mistakenly reject the null hypothesis when it is in fact true. This would correspond to concluding that the wrong amount of peaches was going into the cans, and making some adjustments, when in fact everything was fine and no adjustments were necessary. A Type II error arises when we mistakenly fail to reject the null hypothesis when it is in fact false. This would correspond to concluding that the right amount of peaches was going into the cans, and making no adjustments, when in fact the cans were being either underfilled or overfilled and some adjustment was necessary.
c. If I were a consumer, the underfilled cans would be most important to me! Type II errors are more important, particularly if they led to underfilled cans of peaches. This could also be a consequence of Type I errors.
5.
This is now a two-tailed test, so the p-value = 2 • 0.3192 = 0.6384.
Develop Your Skills 7.2

7. H0: p = 0.3333
H1: p > 0.3333
α = 0.02
p̂ = 0.34, n = 1006
Sampling is done without replacement. The population is Canadian homeowners, of which there are millions, so the sample of 1006 is less than 5% of the population.
np = 1006(0.3333) = 335.3333; nq = 1006(1 − 0.3333) = 670.6667. Both are ≥ 10, so the sampling distribution of p̂ will be approximately normal, with a mean of 0.3333 and a standard error of σp̂ = √(pq/n) = √((0.3333)(0.6667)/1006) = 0.014862599.
P(p̂ ≥ 0.34) = P(z ≥ (0.34 − 0.3333)/0.014862599) = P(z ≥ 0.45) = 1 − 0.6736 = 0.3264
p-value = 0.3264 > α = 0.02
We fail to reject H0. There is insufficient evidence to infer that more than a third of homeowners borrow to renovate.
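The same one-tailed p-value can be computed without the z-table. Using statistics.NormalDist directly on the sampling distribution gives about 0.326, in line with the table value of 0.3264:

```python
from math import sqrt
from statistics import NormalDist

p0, n, p_hat = 0.3333, 1006, 0.34
se = sqrt(p0 * (1 - p0) / n)                  # standard error under H0
p_value = 1 - NormalDist(p0, se).cdf(p_hat)   # one-tailed p-value
```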
9. H0: p = 0.10
H1: p < 0.10
α = 0.05
p̂ = 18/200 = 0.09, n = 200
Sampling is done without replacement. We have no information about the total number of customers in the store, but presumably it would be thousands, so the sample of 200 is not more than 5% of the population.
np = 200(0.10) = 20; nq = 200(1 − 0.10) = 180. Both are ≥ 10, so the sampling distribution of p̂ will be approximately normal, with a mean of 0.10 and a standard error of σp̂ = √(pq/n) = √((0.1)(0.9)/200) = 0.0212132.
P(p̂ ≤ 0.09) = P(z ≤ (0.09 − 0.10)/0.0212132) = P(z ≤ −0.47) = 0.3192
p-value = 0.3192 > α = 0.05
We fail to reject H0. There is insufficient evidence to infer that fewer than 10% of customers opt for the extended warranty coverage.

Develop Your Skills 7.3

11. H0: µ = $50,000
H1: µ > $50,000
α = 0.03
With Excel, we calculate x̄ = 50356 and s = 7962.922669; n = 40 (given).
A histogram of the data is somewhat skewed to the right. However, the sample size is fairly large, at 40, and so this is "normal enough" to use the t-distribution.
[Histogram: "Household Incomes in a Halifax Suburb", with annual income on the horizontal axis and number of households (0 to 12) on the vertical axis.]

Since we are using Excel, it makes sense to use the template. Results are shown below.

Making Decisions About the Population Mean with a Single Sample
Do the sample data appear to be normally distributed?   yes
Sample Standard Deviation s                             7962.92
Sample Mean                                             50356.00
Sample Size n                                           40
Hypothetical Value of Population Mean                   50000
t-Score                                                 0.28275
One-Tailed p-Value                                      0.38943
Two-Tailed p-Value                                      0.77886

This is a one-tailed test. p-value = 0.3894 > α = 0.03
We fail to reject H0. There is insufficient evidence to infer that average household incomes in this particular suburb were more than $50,000 a year.
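The template's t-score is simply the one-sample t statistic, t = (x̄ − µ0)/(s/√n); a quick sketch reproduces it from the sample summary:

```python
from math import sqrt

x_bar, s, n, mu0 = 50356, 7962.92, 40, 50000
t = (x_bar - mu0) / (s / sqrt(n))   # one-sample t statistic
```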
13. H0: µ = $37,876
H1: µ < $37,876
α = 0.02
Using Excel, we find x̄ = 35238 and s = 2752.578754; n = 50.
[Histogram: "Salaries of a Random Sample of Entry-Level Clerks in the Area", with annual salary on the horizontal axis and number of clerks (0 to 16) on the vertical axis.]
The histogram is unimodal and fairly symmetric. The sample size is fairly large, at 50. It is appropriate to use the t-distribution. Since we are using Excel, we will use the template, shown below.

Making Decisions About the Population Mean with a Single Sample
Do the sample data appear to be normally distributed?   yes
Sample Standard Deviation s                             2752.58
Sample Mean                                             35238
Sample Size n                                           50
Hypothetical Value of Population Mean                   37876
t-Score                                                 -6.7767
One-Tailed p-Value                                      7.4E-09
Two-Tailed p-Value                                      1.5E-08

This is a one-tailed test, so the p-value is 7.4 × 10^-9, or 0.0000000074, which is very small. It would be almost impossible to get the sample mean that was obtained from the sample if the average salary of entry-level clerks in the area was actually $37,876. The fact that we did get this sample mean provides strong evidence against the null hypothesis. The p-value < α = 0.02.
We reject H0. There is strong evidence to infer that the average salary of entry-level clerks in the area is lower than $37,876. In other words, the average salary of entry-level clerks in the company is higher than the average in the area.

15. H0: µ = $85
H1: µ > $85
α = 0.05
x̄ = $87.43, s = $16.23, n = 15
We are told to assume the population data are normally distributed.
P(x̄ ≥ 87.43) = P(t ≥ (87.43 − 85)/(16.23/√15)) = P(t ≥ 0.580)
We need to refer to the t-distribution with 14 degrees of freedom. A t-score of 0.580 would be located to the left of t.100. This is a one-tailed test, so p-value > 0.100 > α = 0.05.
We fail to reject H0. There is insufficient evidence to indicate that the average price of the company's competitors is higher than $85 for cleaning a bedroom carpet. If the company wants to be sure that its rates are lower, it must decrease them.

Chapter Review Exercises

1. a. H0: p = 0.03
H1: p < 0.03
b. A Type I error occurs when we mistakenly reject H0 when it is in fact true. In this case, this would correspond to concluding that the shipment contained fewer than 3% defectives, when this was not the case. A Type II error occurs when we fail to reject H0 when it is in fact false. In this case, this would correspond to rejecting the shipment when in fact it contained fewer than 3% defectives.
c. From the toy manufacturer's point of view, Type I error would be important, because it would lead to a higher rate of defective components installed in the toys, which might lead to customer complaints. Type II errors would also have consequences, because they might mean unnecessary delays in production and difficult relations with suppliers. However, Type I error is probably more important.

3.
a. H0: µ = 750 ml  H1: µ ≠ 750 ml
This is a two-tailed test, because there are problems if the water bottles contain either too much or too little water.
b. A Type I error occurs when we mistakenly reject H0 when it is in fact true. In this case, this would correspond to concluding that the bottles do not contain the correct amount of water (and probably adjusting something), when in fact they were fine. A Type II error occurs when we fail to reject H0 when it is in fact false. In this case, this would correspond to failing to notice when the bottles did not contain the correct amount of water, which might lead to customer complaints (or lower profits).
c. As a consumer, you would probably be most concerned about Type II errors, particularly when they led to underfilled bottles. Underfilled bottles could also be a consequence of Type I errors.
5. p-value = 2 • P(p̂ ≥ 0.30)
   = 2 • P(z ≥ (0.30 − 0.26)/√((0.26)(0.74)/200))
   = 2 • P(z ≥ 1.29)
   = 2 • 0.0985
   = 0.197

7.
H0: µ = $725 H1: µ < $725 α = 0.05 We are told to assume that the population data are normally distributed.
x̄ = $641, s = $234, n = 2711
P(x̄ ≤ 641) = P(t ≤ (641 − 725)/(234/√2711)) = P(t ≤ −18.691)
Of course, there is no row in the t-table for 2710 degrees of freedom. If we look at the last row in the table, we can see that the p-value < 0.005 (probably much less). This sample result would be practically impossible to get, if Canadian internet shoppers were actually spending an average of $725 shopping online annually. p-value < α, so reject H0. There is very strong evidence that the annual internet spending by Canadians is less than $725.
9.
H0: p = 0.17 H1: p > 0.17 α = 0.01 In Chapter 6, we noted that the sample of 500 was no more than 5% of the total population. Also, we noted that np and nq ≥ 10, so the sampling distribution of p̂ would be approximately normal. We also calculated
P(p̂ ≥ 0.204) = P(z ≥ (0.204 − 0.17)/√((0.17)(0.83)/500)) = P(z ≥ 2.02) = 1 − 0.9783 = 0.0217
This is a one-tailed test. The p-value = 0.0217 > α = 0.01. Fail to reject H0. There is not enough evidence to conclude that the proportion of Canadians who drink milk as the primary breakfast beverage is greater than 17%, the percentage of Americans who drink milk as the primary breakfast beverage. Note that this is not the same conclusion we drew in Chapter 6, where the implied level of significance was 5%. In this exercise, we have used a smaller level of significance. Under these new conditions, it is harder to reject the null hypothesis.
11.
H0: p = 0.05 H1: p > 0.05 α = 0.02 Sampling is done without replacement, and sample size is 500. As long as the sample of 500 is not more than 5% of the total population of employees, it is appropriate to use the binomial distribution as the underlying model. We proceed by noting that we are making this assumption, and that the conclusions are not valid if this assumption is not correct. np = 500 (0.05) = 25 nq = 500 (1 – 0.05) = 475 Both are ≥ 10, so the sampling distribution of p̂ will be approximately normal.
p̂ = 38/500 = 0.076
P(p̂ ≥ 0.076) = P(z ≥ (0.076 − 0.05)/√((0.05)(0.95)/500)) = P(z ≥ 2.67) = 1 − 0.9962 = 0.0038
p-value = 0.0038 < α = 0.02
We reject H0. There is sufficient evidence to suggest that the proportion of employees who would use the tuition subsidy program is greater than 5%.
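The by-hand z-test above can be verified with a short script using Python's standard library (a sketch, not part of the original solutions; the variable names are mine):

```python
from statistics import NormalDist

# One-tailed test for a proportion: H0: p = 0.05, H1: p > 0.05
p0, n = 0.05, 500
p_hat = 38 / 500                      # sample proportion, 0.076

# z-score for the sample proportion under H0
se = (p0 * (1 - p0) / n) ** 0.5
z = (p_hat - p0) / se                 # ≈ 2.67

# One-tailed p-value: P(Z >= z)
p_value = 1 - NormalDist().cdf(z)     # ≈ 0.0038
```

For a two-tailed test, such as Chapter Review Exercise 5 above, the same tail probability would simply be doubled.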
13.
H0: p = 0.5 H1: p > 0.5 α = 0.05 Sampling is done without replacement, and the sample size is 123. As long as the sample of 123 is not more than 5% of the total population of customers, it is appropriate to use the binomial distribution as the underlying model. We proceed by noting that we are making this assumption, and that the conclusions are not valid if this assumption is not correct. n = 20 + 47 + 32 + 15 + 9 = 123 np = 123(0.5) = 61.5 nq = 123(1 − 0.5) = 61.5 Both are ≥ 10, so the sampling distribution of p̂ will be approximately normal.
p̂ = (20 + 47)/123 = 0.54472
P(p̂ ≥ 0.54472) = P(z ≥ (0.54472 − 0.5)/√((0.5)(0.5)/123)) = P(z ≥ 0.99) = 1 − 0.8389 = 0.1611
p-value = 0.1611 > α = 0.05
We fail to reject H0. There is not enough evidence to infer that more than half of customers agree or strongly agree that the staff at the local branch can provide good advice on their financial affairs.
15. This data set was examined for Chapter 6 Review Exercise 9. The sample data appeared to be approximately normally distributed.

[Histogram: Lifespans of a Random Sample of 40 Electronic Components; x-axis: Lifespan in Hours; y-axis: Number of Components]

The Excel template for this data set is shown below.
Making Decisions About the Population Mean with a Single Sample
Do the sample data appear to be normally distributed?  yes
Sample Standard Deviation s  234.6175952
Sample Mean  6188.3875
Sample Size n  40
Hypothetical Value of Population Mean  6200
t-Score  -0.3130366
One-Tailed p-Value  0.3779604
Two-Tailed p-Value  0.7559208
H0: µ = 6200 H1: µ < 6200 α = 0.05 This is a one-tailed test. From the template, we see that p-value = 0.378 > α = 0.05. Fail to reject the null hypothesis. There is not enough evidence to suggest that the average lifespan of the electronic components is less than 6200 hours.
17.
H0: p = 0.25 H1: p < 0.25 α = 0.05 Sampling is done without replacement, and sample size is 2400. However, the population is all Canadians, so the sample is definitely less than 5% of the population. It is appropriate to use the binomial distribution as the underlying model. n = 2400 Use Excel's Histogram tool to organize the data. The output is shown below. The total was calculated with an Excel formula. In the data set, 0 = not interested, 1 = not sure, 2 = interested, 3 = very interested.
Bin  Frequency
0  1197
1  635
2  377
3  191
Total  2400
The completed Excel template for this data set is shown below.
Making Decisions About the Population Proportion with a Single Sample
Sample Size n  2400
Hypothetical Value of Population Proportion p (decimal form)  0.25
np  600
nq  1800
Are both np and nq >=10?  yes
Sample Proportion  0.23667
z-Score  -1.5085
One-Tailed p-Value  0.06571
Two-Tailed p-Value  0.13143
np = 600 nq = 1800 Both are ≥ 10, so the sampling distribution of p̂ will be approximately normal.
p̂ = (377 + 191)/2400 = 0.23667
This is a one-tailed test, so the p-value = 0.06571 > α = 0.05.
Fail to reject H0. There is not enough evidence to infer that fewer than one quarter of all Canadians are interested or very interested in having smart meters installed in their homes.
19. First, the data must be analyzed. Using Excel's Histogram tool from Data Analysis, we discover the following.

Survey of Drugstore Customers, Ratings of Staff Friendliness
Rating  Number of Customers  Percentage of Customers
Excellent  15  30%
Good  23  46%
Fair  10  20%
Poor  2  4%
Total  50

H0: p = 0.05  H1: p < 0.05  α = 0.04
Sampling is done without replacement. The sample size of 50 is probably not more than 5% of the total customer base of the drugstore.
np = 50(0.05) = 2.5
nq = 50(1 − 0.05) = 47.5
Since np is not ≥ 10, the sampling distribution of p̂ will not be approximately normal. Instead, we use Excel and the binomial distribution. In the sample, 2 customers rated staff friendliness as poor.
P(x ≤ 2, n = 50, p = 0.05) = 0.540533
Since the p-value 0.540533 > α = 0.04, we fail to reject H0. There is insufficient evidence to infer that fewer than 5% of the customers rate staff friendliness as poor.
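The cumulative binomial probability can be reproduced without Excel; a sketch using Python's standard library (my own code, not part of the text):

```python
from math import comb

# P(X <= 2) for X ~ Binomial(n = 50, p = 0.05): the probability of at most
# two "poor" ratings if 5% of all customers rate friendliness as poor
n, p, x = 50, 0.05, 2
p_value = sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(x + 1))
# ≈ 0.5405, matching BINOM.DIST(2, 50, 0.05, TRUE) in Excel
```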
21. H0: µ = 40  H1: µ ≠ 40  α = 0.05
First we examine the data. A histogram of the data set is shown below.

[Histogram: Survey of Drugstore Customers, Customer Ages; x-axis: Age; y-axis: Number of Customers]

This histogram is extremely skewed to the right, and is too skewed for us to proceed. We cannot proceed with the analysis with the tools currently at our disposal.
23. As usual, we must examine the data before we proceed. A histogram is shown below.
[Histogram: Random Sample of City Households; x-axis: After-Tax Incomes for Families of Two or More People; y-axis: Number of Families]

The histogram is skewed to the right. (This is often the case with income data.) The sample size is quite large, at 550. However, before we proceed with the analysis, it might be useful to think about whether there are actually two populations of data here. It could be, for instance, that there are a few high incomes in one exclusive area. Depending on the goal of the analysis, it might be useful to remove these incomes from the data set. However, this should only be done for good reason, and in a logical way. We will not proceed with the current data set, but this is a judgment call. The sample size may be large enough that the sampling distribution will be normal enough for reliable results.
Chapter 8 Solutions Develop Your Skills 8.1 1. First, check conditions. Sampling is done without replacement. We have no way to know if 300 smokers is ≤ 5% of the total number of construction workers. We proceed by noting this, and that the estimate will not be correct if there are not at least 6,000 workers in total. n p̂ = 56 n q̂ = 244 Both are ≥ 10, so we can proceed.
p̂ ± (critical z-score)·√(p̂q̂/n)
56/300 ± 2.576·√((56/300)(244/300)/300)
0.1866667 ± 2.576(0.02249609)
(0.1287, 0.2446)
A 99% confidence interval estimate for the proportion of smokers is (0.1287, 0.2446). The Excel template confirms this, with slightly more accurate results.
Confidence Interval Estimate for the Population Proportion
Confidence Level (decimal form)  0.99
Sample Proportion  0.18667
Sample Size n  300
np-hat  56
nq-hat  244
Are np-hat and nq-hat >=10?  yes
Upper Confidence Limit  0.24461
Lower Confidence Limit  0.12872
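The template's limits can be checked in a few lines of Python (a sketch, not part of the original solutions):

```python
from statistics import NormalDist

# 99% confidence interval for a proportion: 56 smokers in a sample of 300
n = 300
p_hat = 56 / n
z = NormalDist().inv_cdf(0.995)       # ≈ 2.576 for 99% confidence
se = (p_hat * (1 - p_hat) / n) ** 0.5
lower = p_hat - z * se                # ≈ 0.12872
upper = p_hat + z * se                # ≈ 0.24461
```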
3.
First, check conditions. Sampling is done without replacement. However, the sample size of 1000 is certainly no more than 5% of all Canadians. n p̂ = 1000(0.48) = 480 n q̂ = 1000(1-0.48) = 520 Both are ≥ 10, so we can proceed.
p̂ ± (critical z-score)·√(p̂q̂/n)
0.48 ± 1.645·√((0.48)(0.52)/1000)
0.48 ± 1.645(0.015798734)
(0.4540, 0.5060)
A 90% confidence interval estimate for the proportion of all Canadians who do not feel knowledgeable about such television features is (0.4540, 0.5060). The Excel template confirms this, with slightly more accurate results.
Confidence Interval Estimate for the Population Proportion
Confidence Level (decimal form)  0.9
Sample Proportion  0.48
Sample Size n  1000
np-hat  480
nq-hat  520
Are np-hat and nq-hat >=10?  yes
Upper Confidence Limit  0.505986605
Lower Confidence Limit  0.454013395

5.
First, record the information given. n = 1,202
p̂ = 469/1202 = 0.390183
confidence level = 19/20 = 0.95
The result is accurate to within 2.9 percentage points. This means the half-width of the confidence interval is 0.029. The 95% confidence interval will be
(0.390183 − 0.029, 0.390183 + 0.029)
(0.361, 0.419)
The polling company has 95% confidence that the interval (36.1%, 41.9%) contains the percentage of British Columbia residents who think that retailers should provide biodegradable plastic bags to consumers at no charge.
Develop Your Skills 8.2
7. Because the sample data set is clearly non-normal and highly skewed to the right, the necessary conditions are not met, and we cannot construct a confidence interval.

9. First, check the sample data for normality. One possible histogram is shown below.

[Histogram: Random Sample of Marks on a Statistics Test; x-axis: Mark (%); y-axis: Number of Marks]
Since the histogram is fairly normal, we proceed. A completed Excel template is shown below.
Confidence Interval Estimate for the Population Mean
Do the sample data appear to be normally distributed?  yes
Confidence Level (decimal form)  0.95
Sample Mean  59.8
Sample Standard Deviation s  15.5381
Sample Size n  20
Upper Confidence Limit  67.072
Lower Confidence Limit  52.528
A 95% confidence interval estimate for the average grade on the statistics test is (52.5, 67.1).
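The template's limits can be reproduced with a short script (a sketch; the critical value 2.093 is t.025 with 19 degrees of freedom, read from a t-table, since Python's standard library has no t-distribution):

```python
from math import sqrt

# 95% confidence interval for the mean mark: x-bar = 59.8, s = 15.5381, n = 20
x_bar, s, n = 59.8, 15.5381, 20
t_crit = 2.093                                 # t.025 for 19 degrees of freedom
margin = t_crit * s / sqrt(n)                  # ≈ 7.272
lower, upper = x_bar - margin, x_bar + margin  # ≈ (52.528, 67.072)
```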
Develop Your Skills 8.3 11. confidence level is 99%, so z-score = 2.576 HW = 0.05
p̂ = 0.1867 Substitute these values into the formula:
n = p̂q̂(z-score/HW)²
n = (0.1867)(1 − 0.1867)(2.576/0.05)²
  = (0.15184311)(2654.31)
  = 403.03
A sample size of 404 is required to estimate the proportion of smokers on staff to within 5%, with 99% confidence.

13. For 95% confidence, the z-score is 1.96. HW = $10, s = $32.45
n = ((z-score)(s)/HW)²
  = ((1.96)(32.45)/10)²
  = 40.5
A sample size of 41 is necessary to estimate the average grocery bill of households who shop at this supermarket to within $10, with 95% confidence.

15. We will use an estimate of p̂ of 0.5, since no other information is given. For 96% confidence, the z-score is 2.05. HW = 0.04
n = p̂q̂(z-score/HW)²
n = (0.5)(1 − 0.5)(2.05/0.04)²
  = (0.25)(2626.5625)
  = 656.64
A sample size of 657 would have to be taken to estimate the percentage of Canadian internet users who visit social networking sites, to within 4%, with 96% confidence.

Develop Your Skills 8.4
17. A 99% confidence interval estimate for the proportion of smokers is (0.1287, 0.2446). Since this contains 20%, we do not have sufficient evidence to reject the nurse's claim that 20% of the staff are smokers. A 1% level of significance applies.
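The sample-size formulas used in Exercises 11, 13, and 15 above can be wrapped in two small helper functions (a sketch; the function names are mine, and the table z-scores are passed in exactly as in the solutions):

```python
from math import ceil

def n_for_proportion(p_hat, z, hw):
    # n = p-hat * q-hat * (z / HW)^2, rounded up to the next whole unit
    return ceil(p_hat * (1 - p_hat) * (z / hw) ** 2)

def n_for_mean(s, z, hw):
    # n = ((z * s) / HW)^2, rounded up to the next whole unit
    return ceil((z * s / hw) ** 2)
```

For example, `n_for_proportion(0.1867, 2.576, 0.05)` gives 404 and `n_for_mean(32.45, 1.96, 10)` gives 41, matching Exercises 11 and 13.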
19. A 95% confidence interval estimate for the average grade on the statistics test is (52.5, 67.1). Since this interval does not contain 50, there is sufficient evidence to reject your belief that the average mark on the test was 50, with a 5% level of significance.

Chapter Review Exercises
The solutions are shown as calculations done by hand with tables and a calculator, when summary data are provided. The use of Excel tools and templates is shown when an Excel data set is provided.

1.
n = 400 p̂ = 0.26 First, check conditions. Sampling is done without replacement. The population is all Canadian grocery shoppers, so the sample of 400 is definitely ≤ 5% of the population. n p̂ = 400(0.26) = 104 n q̂ = 400(0.74) = 296 Both are ≥ 10, so we can proceed.
a. p̂ ± (critical z-score)·√(p̂q̂/n)
0.26 ± 1.645·√((0.26)(0.74)/400)
0.26 ± 1.645(0.021932)
(0.2239, 0.2961)
A 90% confidence interval estimate for the proportion of all Canadian grocery shoppers who are trying to make healthier choices is (0.2239, 0.2961).

b.
p̂ ± (critical z-score)·√(p̂q̂/n)
0.26 ± 2.05(0.021932)
(0.2150, 0.3050)
A 96% confidence interval estimate for the proportion of all Canadian grocery shoppers who are trying to make healthier choices is (0.2150, 0.3050).

c.
p̂ ± (critical z-score)·√(p̂q̂/n)
0.26 ± 2.576(0.021932)
(0.2035, 0.3165)
A 99% confidence interval estimate for the proportion of all Canadian grocery shoppers who are trying to make healthier choices is (0.2035, 0.3165).
d. As the desired level of confidence increases, the confidence interval gets wider.
3. There is no information about whether the sample data are normally distributed. We will proceed by assuming that they are, with the caution that our results are not reliable if in fact they are not. n = 212, x̄ = 1576
a. s = 521
For 95% confidence, we need to identify t.025 for 211 degrees of freedom. t.025 = 1.972 for 200 degrees of freedom, the closest we could get from the tables.
x̄ ± (critical t-score)·(s/√n)
1576 ± (1.972)(521/√212)
1576 ± 1.972(35.7824)
(1505, 1647)
A 95% confidence interval estimate for the average number of web pages visited per month by Canadian internet users is (1505, 1647).

b.
s = 321
For 95% confidence, we need to identify t.025 for 211 degrees of freedom. t.025 = 1.972 for 200 degrees of freedom, the closest we could get from the tables.
x̄ ± (critical t-score)·(s/√n)
1576 ± (1.972)(321/√212)
1576 ± 1.972(22.0464)
(1533, 1619)
A 95% confidence interval estimate for the average number of web pages visited per month by Canadian internet users is (1533, 1619).

c.
s = 201
For 95% confidence, we need to identify t.025 for 211 degrees of freedom. t.025 = 1.972 for 200 degrees of freedom, the closest we could get from the tables.
x̄ ± (critical t-score)·(s/√n)
1576 ± (1.972)(201/√212)
1576 ± 1.972(13.8047)
(1549, 1603)
A 95% confidence interval estimate for the average number of web pages visited per month by Canadian internet users is (1549, 1603).

d.
As the variability in the data decreases, the confidence intervals become narrower. They do not have to be so wide, because the distributions are not so wide.
5. We have no estimate for p̂, so we will use p̂ = 0.5. Confidence level is 90%, so z-score = 1.645; HW = 0.02. Substitute these values into the formula:
n = p̂q̂(z-score/HW)²
n = (0.5)(1 − 0.5)(1.645/0.02)²
  = (0.25)(6765.0625)
  = 1691.3
A sample size of 1692 is required to estimate the proportion of new graduates of a Business program who are willing to relocate to find a job, to within 2%, with 90% confidence. Since your college graduates only about 350 students from the Business program, this indicates that the entire population should be surveyed for the desired level of accuracy.

7.
We are told the sample data appear normally distributed, so we will assume the population data are as well. n = 30, x̄ = 54.2, s = 3.2. For 99% confidence, we need to identify t.005 for 29 degrees of freedom. t.005 = 2.756.
x̄ ± (critical t-score)·(s/√n)
54.2 ± (2.756)(3.2/√30)
54.2 ± 1.610158
(52.59, 55.81)
A 99% confidence interval estimate for average hours of work per week for these employees is (52.59, 55.81 hours).

9.
confidence level is 95%, so z-score = 1.96 HW = 0.02
p̂ = 0.10 Substitute these values into the formula:
n = p̂q̂(z-score/HW)²
n = (0.1)(1 − 0.1)(1.96/0.02)²
  = (0.09)(9604)
  = 864.36
A sample size of 865 is required to estimate the percentage of the adult population in Canada who would consider buying a hybrid vehicle for their next purchase, to within 2%, with 95% confidence.
11. Sampling is done without replacement. The sample size is 2450, which is quite large. If the total population is all workers, though, there would be millions, and so the sample would be ≤ 5% of the population. n p̂ = 2450(0.43) = 1053.5 n q̂ = 2450(1 – 0.43) = 1396.5 Both are ≥ 10, so we can proceed. For a 99% confidence level, the z-score is 2.576.
p̂ ± (critical z-score)·√(p̂q̂/n)
0.43 ± 2.576·√((0.43)(0.57)/2450)
0.43 ± 2.576(0.010002041)
(0.4042, 0.4558)
A 99% confidence interval estimate for the proportion of workers who phone in sick when they are not ill is (0.4042, 0.4558). It is hard to assess the reliability of these results. Would people tell the truth when they were asked such a question? There are a number of reasons why they might not.

13. We are told the sample data appear approximately normal, so we will assume the population data are normal. n = 40, x̄ = $543.21, s = $47.89. For 95% confidence, we need to identify t.025 for 39 degrees of freedom. Since there is no row in the table for 39 degrees of freedom, we will use the row for 40 degrees of freedom. We see that t.025 = 2.021.
x̄ ± (critical t-score)·(s/√n)
543.21 ± (2.021)(47.89/√40)
543.21 ± 15.30316127
($527.91, $558.51)
A 95% confidence interval estimate for the average monthly rent for students at this college is ($527.91, $558.51). Because this interval does not contain $500, there is sufficient evidence to reject the claim that the average monthly rent is $500, with 5% significance.
15. First we have to organize the data. We can use Excel’s Histogram tool to organize the data, and then produce the following table.
Customer Survey for an Ice Cream Store
Which flavour would you like to try, if any?
Response  Frequency  Relative Frequency
Pecan and Fudge  37  24.7%
Apple Pie  16  10.7%
Banana Caramel Ripple  29  19.3%
Ginger and Honey  32  21.3%
Would Not Try Any Of These Flavours  36  24.0%
Total  150
From this we see p̂ = 0.24. Sampling is done without replacement. We do not know the total number of customers at the ice cream store. As long as the sample of 150 is not more than 5% of this total number, we can proceed.
Confidence Interval Estimate for the Population Proportion
Confidence Level (decimal form)  0.95
Sample Proportion  0.1933
Sample Size n  150
np-hat  29
nq-hat  121
Are np-hat and nq-hat >=10?  yes
Upper Confidence Limit  0.256531268
Lower Confidence Limit  0.130135399
A 95% confidence interval estimate for the percentage of customers who would not try any of the new flavours is (0.1301, 0.2565).
17. The data appear to be normally distributed.
[Histogram: No. of Customers Renting a Car in the 8 a.m.-10 a.m. Period at a Car Rental Agency; x-axis: Number of Customers; y-axis: Number of Days]
Confidence Interval Estimate for the Population Mean
Do the sample data appear to be normally distributed?  yes
Confidence Level (decimal form)  0.99
Sample Mean  21.48
Sample Standard Deviation s  3.56422
Sample Size n  50
Upper Confidence Limit  22.8308
Lower Confidence Limit  20.1292
A 99% confidence interval estimate for the average number of customers renting a car in the 8 a.m.-10 a.m. period at this car rental agency is (20.13, 22.83). Of course, it is not possible for there to be 20.13 customers. This number arises because the underlying data are not continuous (they are counts). However, it is not unusual to create a confidence interval estimate for such data sets. Realistically, the confidence interval would be (20, 23).
19. For 98% confidence, the z-score is 2.326. HW = $10 s = 69.861423
n = ((z-score)(s)/HW)²
  = ((2.326)(69.861423)/10)²
  = 264.05
A sample size of 265 is necessary to estimate the annual maintenance costs of this entry-level compact in the 3rd year of its life to within $10, with 98% confidence.

21. confidence level is 95%, so z-score = 1.96; HW = 0.05
p̂ = 0.47 Substitute these values into the formula:
n = p̂q̂(z-score/HW)²
n = (0.47)(1 − 0.47)(1.96/0.05)²
  = (0.2491)(1536.64)
  = 382.77
A sample size of 383 is required to estimate the proportion of accounts receivable that are 0-30 days old, to within 5%, with 95% confidence.
Chapter 9 Solutions Develop Your Skills 9.1 1. First, calculate differences.
Worker  Average Daily Production Before Music  Average Daily Production After Music  Difference
1   18  18  0
2   14  15  -1
3   10  12  -2
4   11  15  -4
5   9   7   2
6   10  11  -1
7   9   6   3
8   11  14  -3
9   10  11  -1
10  12  12  0
Differences appear normally distributed, although as usual, this can be hard to determine with small sample sizes.
[Histogram: Production Before and After Music is Played in the Plant; x-axis: (Average Daily Production Before Music) − (Average Daily Production After Music); y-axis: Number of Workers]

H0: µD = 0
H1: µD < 0
(The order of subtraction is production before the music − production after the music is played. If music increases productivity, the production before the music should be lower than after the music is played, so the average difference would be negative.)
α = 0.04
Calculations can be done by hand, with Excel functions and the template, or with the Data Analysis tool. A completed template is shown below (mean, standard deviation, and sample size were computed with Excel functions).
Making Decisions About the Population Mean with a Single Sample
Do the sample data appear to be normally distributed?  yes
Sample Standard Deviation s  2.11082
Sample Mean  -0.7
Sample Size n  10
Hypothetical Value of Population Mean  0
t-Score  -1.0487
One-Tailed p-Value  0.16083
Two-Tailed p-Value  0.32166
This is a one-tailed test, so the p-value is 0.16. Since this is > 4%, we fail to reject H0. There is insufficient evidence to suggest that playing classical music led to increased worker productivity.

3.
We are not given the data set, but some summary data. We cannot check for normality of differences. We proceed by assuming the differences are normally distributed, and noting that our conclusions may not be valid if this is not the case. H 0: µ D = 0 H 1: µ D ≠ 0 (The order of subtraction is before – after. The alternative hypothesis concerns a difference in daily sales before and after the script change, either positive or negative.) α = 0.05 We are given: x D = 4.2 sD = 23.4 nD = 56
t = (x̄D − µD)/(sD/√nD) = (4.2 − 0)/(23.4/√56) = 1.343
We refer to the t-distribution with 55 degrees of freedom. There is no such row in the t-table, but whether we choose the row with 50 or 60 degrees of freedom, the calculated t-score of 1.343 is between t.100 and t.050. This is a two-tailed test, so 2 • 0.050 < p-value < 2 • 0.100 0.1 < p-value < 0.2
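The t-score itself is easy to verify with a short sketch (my own code, not part of the text):

```python
from math import sqrt

# t statistic for the mean of the paired differences (before - after)
d_bar, s_d, n = 4.2, 23.4, 56
t = (d_bar - 0) / (s_d / sqrt(n))   # ≈ 1.343
```

The p-value still has to be bracketed from the t-table, as above, since Python's standard library has no t-distribution.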
Since the p-value > 5%, we fail to reject H0. There is insufficient evidence to suggest there is a difference in daily sales by the telemarketers before and after the script change.

5.
We are told the histogram of differences appears to be normally distributed. We are given summary data.
H0: µD = 0
H1: µD > 0
(The order of subtraction is fuel consumption without checking tires − fuel consumption checking tires. If checking the tires improves fuel consumption (that is, reduces it), then these differences would tend to be positive.)
α = 0.04
We could do this question by hand or with the Excel template.
Making Decisions About the Population Mean with a Single Sample
Do the sample data appear to be normally distributed?  yes
Sample Standard Deviation s  1.4
Sample Mean  0.4
Sample Size n  20
Hypothetical Value of Population Mean  0
t-Score  1.27775
One-Tailed p-Value  0.10836
Two-Tailed p-Value  0.21673
This is a one-tailed test, so the p-value is 0.11. This is > 0.04, so we fail to reject H0. There is insufficient evidence to support the association’s claim that checking tire pressure regularly improves fuel consumption. However, there may be other explanatory factors at play. Although fuel consumption was recorded during two summer months, driving behavior and weather could have been quite different in the two months, and so we cannot consider this test to be definitive.
Develop Your Skills 9.2 7. We have seen a similar problem in Develop Your Skills 9.1 Exercise 2, but the data set has changed. Now, the differences are non-normal.
[Histogram: Sales for Gourmet Cookies, Before and After Packaging Redesign; x-axis: (Sales After Packaging Redesign) − (Sales Before Packaging Redesign); y-axis: Number of Stores]
The sample size is small. The histogram is not perfectly symmetric, but it does show a somewhat symmetric U-shape, so we will proceed with the WSRST. H 0: H 1:
populations of weekly sales of gourmet cookies before and after the packaging redesign are the same population of weekly sales of gourmet cookies after the packaging redesign is to the right of the population of weekly sales before the packaging redesign (that is, weekly sales of gourmet cookies are generally greater after the packaging redesign, compared to before the redesign)
α = 0.05 Now we must rank the differences (their absolute values) and compute W+ and W-. The table below summarizes.
Differences: 128.33, -132.83, -88.38, 144.51, -114.78, 26.46, -61.51, 154.1, 134.48, -130.72

Ordered Absolute Differences  Ranks To Be Assigned  Ranks For Positive Differences  Ranks For Negative Differences
26.46    1   1
61.51    2       2
88.38    3       3
114.78   4       4
128.33   5   5
130.72   6       6
132.83   7       7
134.48   8   8
144.51   9   9
154.1    10  10
Sums     55  W+ = 33  W- = 22
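The rank sums can be reproduced with a short script (a sketch, not part of the original solutions; there are no tied absolute differences here, so simple sequential ranking works):

```python
# Differences (after - before) from the table above
diffs = [128.33, -132.83, -88.38, 144.51, -114.78,
         26.46, -61.51, 154.10, 134.48, -130.72]

# Rank absolute differences from smallest to largest, then sum the
# ranks belonging to positive and to negative differences
ranked = sorted(diffs, key=abs)
w_plus = sum(r for r, d in enumerate(ranked, start=1) if d > 0)    # 33
w_minus = sum(r for r, d in enumerate(ranked, start=1) if d < 0)   # 22
```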
The order of subtraction is sales after the packaging redesign − sales before the packaging redesign. If the packaging redesign increased sales, these differences would tend to be positive. Many positive differences would lead to a high rank sum for the positive differences. So, p-value = P(W+ ≥ 33). Since the sample size is small, we turn to the WSRST table, for nW = 10. We see P(W+ ≥ 44) = 0.053, so we know P(W+ ≥ 33) > 0.053. We fail to reject H0. There is insufficient evidence to suggest that sales of the gourmet cookies increased after the packaging was redesigned.

We have seen a similar problem in Develop Your Skills 9.1 Exercise 4, but the data set has changed. This is a large data set, so we will use Excel. A histogram of differences is shown below.
9. [Histogram: Number of Hours Studied Over a Four-Week Period; x-axis: (Hours Studied by Male Students) − (Hours Studied by Female Students); y-axis: Number of Student Pairs]
This histogram is fairly symmetric, and almost normal-looking. The problem is that there is an outlier, which significantly affects the mean in this data set. This outlier may be an error (or a lie). It occurs with an observation of a male student who claims to have studied 437 hours over the 4-week period. Since this amounts to about 15½ hours of studying per day, it is suspicious. However, we have no way to check the accuracy of the data.
H0: populations of hours of study for male and female students are the same
H1: population of hours of study for male students is to the left of (below) the population of hours of study for female students
α = 0.04
We will use the Excel add-in Wilcoxon Signed Rank Sum Test Calculations to get W+ and W- for this data set. The output is as follows.
Wilcoxon Signed Rank Sum Test Calculations
sample size  100
W+  2189
W-  2861
Since sample size is large, at 100, we can use the Excel template based on the normal approximation to the sampling distribution for this test.
Making Decisions About Matched Pairs, Quantitative Data, Non-Normal Differences (WSRST)
Sample Size  100
Is the sample size at least 25?  yes
Is the histogram of differences symmetric?  yes
W+  2189
W-  2861
z-Score  1.15527715
One-Tailed p-Value  0.12398848
Two-Tailed p-Value  0.24797695

This is a one-tailed test, so the p-value is 0.124. We fail to reject H0. There is insufficient evidence to suggest that male students study less than female students.
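The template's z-score follows from the normal approximation to the sampling distribution of the rank sum; a sketch in Python (my own code, using W- since the alternative hypothesis puts the male-minus-female differences below zero):

```python
from math import sqrt
from statistics import NormalDist

# Normal approximation for the Wilcoxon signed rank sum, n = 100
n, w_minus = 100, 2861
mean = n * (n + 1) / 4                      # 2525
sd = sqrt(n * (n + 1) * (2 * n + 1) / 24)   # ≈ 290.84
z = (w_minus - mean) / sd                   # ≈ 1.1553
p_one_tailed = 1 - NormalDist().cdf(z)      # ≈ 0.124
```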
Develop Your Skills 9.3
11. H0: p = 0.5 (half the cola drinkers prefer Cola A, half prefer the other brand)
H1: p ≠ 0.5 (there is a difference in preferences for Cola A and the other brand)
α = 0.05
nST = 16 − 1 = 15
n+ = 9, so n- = 6
P(n+ ≥ 9, n = 15, p = 0.5) = 1 − P(n+ ≤ 8) = 1 − 0.696 = 0.304
Since this is a two-tailed test, the p-value = 2 • 0.304 = 0.608. We fail to reject H0. There is insufficient evidence to suggest there is a difference in preferences for Cola A and the other brand.

13. First, analyze the differences in the ratings.
Analyst  Rating for North America  Rating for Europe  Difference
1   3  4  -
2   2  3  -
3   4  2  +
4   3  2  +
5   3  1  +
6   2  3  -
7   3  2  +
8   3  2  +
9   3  4  -
10  2  1  +
11  4  4  0
H0: p = 0.5 (the ratings by analysts for the North American and European economies are the same)
H1: p ≠ 0.5 (there is a difference in ratings for the North American and European economies by all analysts)
α = 0.03
nST = 11 − 1 = 10
n+ = 6, n- = 4
P(n- ≤ 4, nST = 10, p = 0.5) = 0.377
This is a two-tailed test, so p-value = 2 • 0.377 = 0.754. We fail to reject H0. There is insufficient evidence to suggest that there is a difference in the ratings for the North American and European economies by analysts.

15. H0: p = 0.5 (potential customers are equally ready to buy an HDTV before and after seeing an ad about HDTVs)
H1: p > 0.5 (potential customers are more ready to buy an HDTV after seeing an ad about HDTVs; p is the proportion of potential customers more likely to buy an HDTV after seeing the ad)
α = 0.05
First we use the Non Parametric Tool, and the Sign Test Calculations, to analyze the data.
Sign Test Calculations
# of non-zero differences: 132
# of positive differences: 47
# of negative differences: 85
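The sign-test p-values that the template reports are binomial tail probabilities, and they can be reproduced with a short sketch. This is an illustration only, not the textbook's tool, and the function name is my own:

```python
from math import comb

def sign_test_p_value(n_plus, n_minus, two_tailed=True):
    """Exact binomial p-value for the sign test (H0: p = 0.5).

    n_plus and n_minus are the counts of positive and negative
    non-zero differences; ties (zero differences) are dropped
    before calling.
    """
    n = n_plus + n_minus
    k = max(n_plus, n_minus)  # the more extreme of the two counts
    # P(X >= k) when X ~ Binomial(n, 0.5)
    tail = sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n
    return min(1.0, 2 * tail) if two_tailed else tail

# Exercise 15's counts: 47 positive and 85 negative differences
print(sign_test_p_value(47, 85, two_tailed=False))
```

For Exercise 11's counts (9 positive, 6 negative) the one-tailed value is 0.304, matching the by-hand calculation above; for Exercise 15 the one-tailed value comes out close to the template's 0.0006.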
The order of subtraction is willingness to buy before the ad – willingness to buy after the ad. A higher number indicates a greater willingness to buy, so if the ad increases willingness to buy, this should result in more minus signs. We see that there are 85 negative differences, which indicates that 85 of 132 customers increased their willingness to buy after seeing the ad.
Making Decisions About Matched Pairs, Ranked Data (Sign Test)
Number of Non-Zero Differences: 132
Number of Positive Differences: 47
Number of Negative Differences: 85
One-Tailed p-Value: 0.0006
Two-Tailed p-Value: 0.0012
From the template, we see the one-tailed p-value is 0.0006, which is quite small. We reject H0. There is sufficient evidence to infer that willingness to buy an HDTV increased after potential customers saw the ad. We can also do this question by hand. Sampling is done without replacement. There are 132 customers in the sample, presumably less than 5% of all potential customers for HDTVs. nST = 132 > 20, so the sampling distribution of p̂ will be approximately normal.
z = (p̂ - p)/σp̂ = ((85/132) - 0.5)/√((0.5)(0.5)/132) = 3.31
p-value = P(z ≥ 3.31) = 1 - 0.9995 = 0.0005
Since the p-value is < α, we reject H0. There is sufficient evidence to suggest that potential customers are more ready to buy an HDTV after seeing an ad about HDTVs. Notice that the p-values with the template (based on the binomial distribution) and the by-hand approximation are almost equal here.

Chapter Review Exercises

1. Matched-pairs samples are better than independent samples for exploring cause and effect, because they control some of the potential causal variables, and therefore take them out of the picture. If we match Business grads according to age, experience, location, and academic performance, then we know that any difference in salary is not caused by these factors.

3.
The two different approaches will usually, but not always, lead to the same conclusion. It is harder to reject the null hypothesis with the Wilcoxon Signed Rank Sum Test, and this is why the t-test of µD is preferred, if the necessary conditions are met. Remember, the Wilcoxon Signed Rank Sum Test works with the ranks of the values, not the actual values, and so it gives up some of the information available in the sample data.
5.
Tom is right. He has just expressed your conclusion in a different way. If the new version is rated more highly than the old version, we can also say that the old version ratings are lower than the new version ratings. This exercise is a reminder that you should read and think carefully about these comparisons. Don't get tripped up by the language.
7.
H0: p = 0.5 (students rate the two designs the same)
H1: p ≠ 0.5 (students rate the two designs differently)
α = 0.025
Sampling is done without replacement. We do not know the total number of students at the college. As long as there are 8,000 or more, the sample of 400 will be less than about 5% of the population, and we can use the binomial distribution.
nST = 400 - 27 = 373 > 20, so the sampling distribution of p̂ will be approximately normal.
z = (p̂ - p)/σp̂ = ((207/373) - 0.5)/√((0.5)(0.5)/373) = 2.12
p-value = 2 • P(z ≥ 2.12) = 2 • (1 - 0.9830) = 2 • 0.0170 = 0.034 > α
Fail to reject H0. There is not enough evidence to suggest that the students rate the two designs differently.

9.
H0: µD = 0
H1: µD > 0 (The order of subtraction is (price for job in wealthy neighbourhood - price for job in run-down neighbourhood). If the contractors charge more in the wealthier neighbourhoods, the differences will be positive.)
α = 0.05
We are told to assume the differences are normally distributed. We are given:
x̄D = 1262, sD = 478, nD = 10
The Excel template is shown below.
Making Decisions About the Population Mean with a Single Sample
Do the sample data appear to be normally distributed? yes
Sample Standard Deviation s: 478
Sample Mean: 1262
Sample Size n: 10
Hypothetical Value of Population Mean: 0
t-Score: 8.34894
One-Tailed p-Value: 7.9E-06
Two-Tailed p-Value: 1.6E-05
The p-value is very small. Reject H0. There is sufficient evidence to suggest that the contractors charge higher prices in wealthier neighbourhoods.
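The template's t-score can be checked with a few lines of Python. This is a hedged sketch of the matched-pairs formula t = (x̄D - µD)/(sD/√nD); the helper name is my own:

```python
from math import sqrt

def paired_t_score(d_bar, s_d, n, mu_0=0.0):
    """t-score for a matched-pairs test on the mean difference:
    t = (d_bar - mu_0) / (s_d / sqrt(n))."""
    return (d_bar - mu_0) / (s_d / sqrt(n))

# Exercise 9's summary statistics: x-bar_D = 1262, s_D = 478, n_D = 10
print(round(paired_t_score(1262, 478, 10), 5))  # 8.34894, as in the template
```

The p-value then comes from the t distribution with n - 1 = 9 degrees of freedom, which is what the template reports.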
11. H0: p = 0.5 (diners rate the two salads the same)
H1: p > 0.5 (diners rate the mixed green salad higher, where p is defined as the proportion of diners who prefer the mixed green salad)
α = 0.03
Sampling is done without replacement. We do not know the total number of diners at the restaurant. As long as there are 700 or more, the sample of 35 will be less than about 5% of the population, and we can use the binomial distribution.
nST = 35 - 3 = 32
We could use the sampling distribution of p̂ here, but the approximation will not be that good, because the sample size is fairly small. Instead we will use the Excel template.

Making Decisions About Matched Pairs, Ranked Data (Sign Test)
Number of Non-Zero Differences: 32
Number of Positive Differences: 20
Number of Negative Differences: 12
One-Tailed p-Value: 0.10766
Two-Tailed p-Value: 0.21533
This is a one-tailed test. The p-value is 0.108. Fail to reject H0. There is insufficient evidence to suggest that diners prefer the mixed green salad.

13. There is sufficient evidence to reject the hypothesis that the tasks are completed in the same time with the two programs. There is sufficient evidence, at the 5% level of significance, to suggest that there is a difference in the amount of time it takes to complete tasks with the two programs. The new software would be recommended. The interval (3.9 minutes, 14.3 minutes) probably contains the average reduction in time on task with the new software.

15. H0: µD = 0
H1: µD < 0 (for order of subtraction (new business before training - new business after training))
α = 0.025
We are told we can assume the differences are normally distributed. First, calculate the differences.
Staff Member   Monthly New Business Before Training ($000s)   Monthly New Business After Training ($000s)   Difference
Shirley        $230                                           $240                                          -$10
Tom            $150                                           $165                                          -$15
Janice         $100                                           $90                                           $10
Brian          $75                                            $100                                          -$25
Ed             $340                                           $330                                          $10
Kim            $500                                           $525                                          -$25
Using standard formulas, we calculate: x̄D = -$9.16667, sD = 15.942605, nD = 6
t = (x̄D - µD)/(sD/√nD) = (-9.16667 - 0)/(15.942605/√6) = -1.408
We refer to the t-table, looking at the row of critical values for 5 degrees of freedom. Since t.100 = 1.476, we know P(t ≤ -1.408) > 0.10. Fail to reject H0. There is insufficient evidence to infer that monthly new business increased after the training.

17. With non-normal differences, we must use the Wilcoxon Signed Rank Sum Test.

Staff Member   Difference   Absolute Value of Difference   Rank for Positive Difference   Rank for Negative Difference
Shirley        -$10         10                                                            2
Tom            -$15         15                                                            4
Janice         $10          10                             2
Brian          -$25         25                                                            5.5
Ed             $10          10                             2
Kim            -$25         25                                                            5.5
                                                           W+ = 4                         W- = 17

The ordered absolute differences are 10, 10, 10, 15, 25, 25, and the ranks to be assigned are 1, 2, 3, 4, 5, 6 (sum = 21). The three tied values of 10 share the average rank 2, and the two tied values of 25 share the average rank 5.5.
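The rank assignment above, including the averaging of tied ranks, can be sketched in a few lines. This is an illustration under the same conventions as the table (zeros dropped, tied absolute values share the average rank); the function name is my own:

```python
def signed_rank_sums(differences):
    """W+ and W- for the Wilcoxon Signed Rank Sum Test.

    Zero differences are dropped; tied absolute values share the
    average of the ranks they would occupy.
    """
    nonzero = [d for d in differences if d != 0]
    ordered = sorted(nonzero, key=abs)
    ranks = {}  # average rank for each distinct absolute value
    i = 0
    while i < len(ordered):
        j = i
        while j < len(ordered) and abs(ordered[j]) == abs(ordered[i]):
            j += 1
        ranks[abs(ordered[i])] = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        i = j
    w_plus = sum(ranks[abs(d)] for d in nonzero if d > 0)
    w_minus = sum(ranks[abs(d)] for d in nonzero if d < 0)
    return w_plus, w_minus

# Differences from Exercise 17
print(signed_rank_sums([-10, -15, 10, -25, 10, -25]))  # (4.0, 17.0)
```

The output reproduces W+ = 4 and W- = 17 from the table.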
Because of the order of subtraction, we expect W- to be the larger rank sum, which it is.
p-value = P(W ≥ W-) = P(W ≥ 17)
From the table, we see P(W ≥ 18) = 0.078, so P(W ≥ 17) > 0.078. Fail to reject H0. There is insufficient evidence to suggest that monthly new business increased after the training.

19. H0: µD = 0
H1: µD < 0 (for order of subtraction (completion time with new-style drill - completion time with old-style drill))
α = 0.05
We are told to assume the differences in completion times are normally distributed. We are given x̄D = -5.2, sD = 12.2, nD = 20.
t = (x̄D - µD)/(sD/√nD) = (-5.2 - 0)/(12.2/√20) = -1.91
We refer to the t-table, looking at the row with n - 1 = 20 - 1 = 19 degrees of freedom. We see that |t| = 1.91 is located between t.050 and t.025. This is a one-tailed test.
0.025 < p-value < 0.05
Reject H0. There is sufficient evidence to suggest that task completion times with the new-style drill are shorter than with the old-style drill.

21. A histogram of the differences is shown below.

[Histogram: Differences in Salaries of Business and Computer Studies Graduates. Horizontal axis: (Salary of Business Graduate) - (Salary of Computer Studies Graduate); vertical axis: Frequency.]
The differences are not perfectly normally distributed. However, the sample size is fairly large, so we will continue with the t-test.
H0: µD = 0
H1: µD ≠ 0 (The order of subtraction is Business salary - Computer Studies salary.)
α = 0.025
We can use Excel functions and the template, as shown below.
Making Decisions About the Population Mean with a Single Sample
Do the sample data appear to be normally distributed? yes
Sample Standard Deviation s: 4684.629
Sample Mean: -$533.333
Sample Size n: 30
Hypothetical Value of Population Mean: 0
t-Score: -0.623569
One-Tailed p-Value: 0.268893
Two-Tailed p-Value: 0.537785
The two-tailed p-value is 0.538 > 0.025. Fail to reject H0. There is insufficient evidence to suggest that salaries of Business grads are different from salaries of Computer Studies grads.
23. As usual, with a small data set, it is difficult to assess normality. One possible histogram is shown below.
The histogram is somewhat skewed to the left, but we will assume normality and proceed. The completed Excel template is shown below.
Making Decisions About the Population Mean with a Single Sample
Do the sample data appear to be normally distributed? yes
Sample Standard Deviation s: 18.1491
Sample Mean: 13.1304
Sample Size n: 23
Hypothetical Value of Population Mean: 0
t-Score: 3.46966
One-Tailed p-Value: 0.00109
Two-Tailed p-Value: 0.00218
The order of subtraction is (Playing Times Before Changes) – (Playing Times After Changes). If the changes increased the speed of play, we would expect a positive difference. The p-value is 0.001 < 0.05. Reject H0. There is enough evidence to conclude that playing times were faster after the changes were made. Note that the change in approach may have caused the faster play. However, it may be that the fact that the course marshal was obviously focused on faster play was the real cause.
25. Because these are matched pairs of ranked data, we will use the Sign Test.
H0: p = 0.5 (employees rate the two presidents the same)
H1: p > 0.5 (employees rate the new president higher than the old president)
α = 0.04
We can use the Non Parametric Tools Add-In, the Sign Test Calculations, to analyze the ratings. The output is shown below.
Sign Test Calculations
# of non-zero differences: 8
# of positive differences: 6
# of negative differences: 2
We can then use the Sign Test template to complete the hypothesis test.
Making Decisions About Matched Pairs, Ranked Data (Sign Test)
Number of Non-Zero Differences: 8
Number of Positive Differences: 6
Number of Negative Differences: 2
One-Tailed p-Value: 0.14453125
Two-Tailed p-Value: 0.2890625
We see that the one-tailed p-value is 0.145 > 0.04. Fail to reject H0. There is not enough evidence to conclude that employees rate the new president higher than the old president.
Chapter 10 Solutions

Develop Your Skills 10.1

1. Call the defects on the night shift population 1, and the defects on the day shift population 2.
H0: µ1 - µ2 = 0
H1: µ1 - µ2 > 0
α = 0.05
x̄1 = 35.4, x̄2 = 27.8, s1 = 15.3, s2 = 7.9, n1 = 45, n2 = 50
We are told that the population distributions of errors are normal.
t = ((x̄1 - x̄2) - (µ1 - µ2))/√(s1²/n1 + s2²/n2)
= ((35.4 - 27.8) - 0)/√(15.3²/45 + 7.9²/50)
= 2.992
Degrees of freedom: minimum of (n1 - 1) and (n2 - 1), so minimum (44, 49) = 44. Closest row in the table is for 45 degrees of freedom.
p-value < 0.005
Reject H0. There is sufficient evidence to infer that the number of defects is higher on the night shift than on the day shift, on average. Using the Excel template, we find a more exact p-value of 0.00196.
Making Decisions About the Difference in Population Means with Two Independent Samples
Do the sample data appear to be normally distributed? yes
Sample 1 Standard Deviation: 15.3
Sample 2 Standard Deviation: 7.9
Sample 1 Mean: 35.4
Sample 2 Mean: 27.8
Sample 1 Size: 45
Sample 2 Size: 50
Hypothetical Difference in Population Means: 0
t-Score: 2.99245
One-Tailed p-Value: 0.00196
Two-Tailed p-Value: 0.00393
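The unpooled two-sample t-score and the conservative by-hand degrees of freedom used in Exercise 1 can be sketched as follows (the helper name is my own; the conservative min(n1 - 1, n2 - 1) rule is the book's by-hand shortcut, not the Welch formula Excel uses):

```python
from math import sqrt

def two_sample_t(x1, x2, s1, s2, n1, n2, hyp_diff=0.0):
    """t-score for the difference in means, independent samples,
    unpooled (unequal-variances) standard error."""
    se = sqrt(s1 ** 2 / n1 + s2 ** 2 / n2)
    t = (x1 - x2 - hyp_diff) / se
    df = min(n1 - 1, n2 - 1)  # conservative by-hand degrees of freedom
    return t, df

t, df = two_sample_t(35.4, 27.8, 15.3, 7.9, 45, 50)
print(round(t, 5), df)  # 2.99245 44
```

The t-score matches the template's 2.99245; Excel's more exact p-value comes from a larger (Welch) degrees of freedom, which is why it is slightly smaller than the table-based bound.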
3.
Use the Excel template to construct the confidence interval. Of course, this can also be done manually with the formula
(x̄1 - x̄2) ± t-score • √(s1²/n1 + s2²/n2).

Confidence Interval Estimate for the Difference in Population Means
Do the sample data appear to be normally distributed? yes
Sample 1 Standard Deviation: 9.23721
Sample 2 Standard Deviation: 4.8948
Sample 1 Mean: $32.43
Sample 2 Mean: $27.82
Sample 1 Size: 15
Sample 2 Size: 14
Confidence Level (decimal form): 0.95
Upper Confidence Limit: 10.2621
Lower Confidence Limit: -1.052
With 95% confidence, we estimate that the interval (-1.05, 10.26) contains the true average difference in the purchases of females, compared to males, at this drugstore. We expect this interval to contain zero, since we failed to reject the hypothesis that the difference was zero in Exercise 2.

5.
Call the listening times of the listeners aged 25 and younger, population 1, and the listening times of the listeners over 25, population 2.
H0: µ1 - µ2 = 0
H1: µ1 - µ2 ≠ 0
α = 0.05
x̄1 = 256.8, x̄2 = 218.3, s1 = 50.3, s2 = 92.4, n1 = 30, n2 = 35
t = ((x̄1 - x̄2) - (µ1 - µ2))/√(s1²/n1 + s2²/n2)
= ((256.8 - 218.3) - 0)/√(50.3²/30 + 92.4²/35)
= 2.125
Degrees of freedom, for the by-hand method: minimum (29, 34) = 29.
0.010 • 2 < p-value < 0.025 • 2
0.020 < p-value < 0.05
Since the p-value is < α = 0.05, reject H0. There is sufficient evidence to infer that listening habits differ (in terms of average listening time), by age.
Develop Your Skills 10.2

7. H0: there is no difference in the locations of the populations of distances travelled by the current best-selling golf ball and the new golf ball
H1: the population of distances travelled by the current best-selling golf ball is to the left of the population of distances travelled by the new golf ball
α = 0.05
First, we must check the histograms for normality.
[Histogram: Distances Travelled by Current Best-Selling Golf Ball. Horizontal axis: Metres; vertical axis: Frequency.]

[Histogram: Distances Travelled by New Golf Ball. Horizontal axis: Metres; vertical axis: Frequency.]
Both histograms are non-normal, but they are similar in shape and spread, so we proceed with the Wilcoxon Rank Sum Test. We will use Excel to analyze these data.
Wilcoxon Rank Sum Test Calculations
Sample 1 size: 12
Sample 2 size: 15
W1: 221
W2: 157
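The z-score reported for the WRST comes from the normal approximation with E(W1) = n1(n1 + n2 + 1)/2 and σW1 = √(n1 · n2 · (n1 + n2 + 1)/12). A sketch (function name mine):

```python
from math import sqrt
from statistics import NormalDist

def wrst_z(w1, n1, n2):
    """Normal approximation for the Wilcoxon Rank Sum Test.

    w1 is the rank sum of the smaller sample (size n1); both sample
    sizes should be at least 10 for the approximation to be reasonable.
    """
    mean = n1 * (n1 + n2 + 1) / 2
    sd = sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (w1 - mean) / sd
    one_tailed_p = 1 - NormalDist().cdf(abs(z))
    return z, one_tailed_p

z, p = wrst_z(221, 12, 15)
print(round(z, 4), round(p, 4))  # 2.5861 0.0049
```

These values reproduce the template's z-score of 2.58613519 and one-tailed p-value of 0.00485294.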
We will use the template for the Wilcoxon Rank Sum Test for independent samples.
Making Decisions About Two Population Locations, Two Independent Samples of Non-Normal Quantitative Data or Ranked Data (WRST)
Sample 1 Size: 12
Sample 2 Size: 15
Are both sample sizes at least 10? yes
Are the sample histograms similar in shape and spread? yes
W1: 221
W2: 157
z-Score (based on W1): 2.58613519
One-Tailed p-Value: 0.00485294
Two-Tailed p-Value: 0.00970589

Again, for consistency, we have selected Sample 1 as the smaller sample, so that we assign W1 as the rank sum of the smaller sample, which contains the distances travelled by the new golf ball. We see that n1 = 12, and n2 = 15. Since both are ≥ 10, we can use the normal approximation to the sampling distribution of W1. If the distances travelled by the new golf ball are longer, we would expect W1 to be high.
p-value = P(W1 > 221) = 0.005
The p-value < α = 0.05. Reject H0. There is sufficient evidence to suggest that the population of distances travelled by the current best-selling ball is to the left of the population of distances travelled by the new golf ball. Note this is equivalent to saying the distances travelled by the new golf ball are to the right of the distances travelled by the current best-selling ball.

9.
H0: there is no difference in the locations of the weight losses for young women aged 18-25 who take the diet pill, compared with those who do not take the diet pill
H1: the location of the population of weight losses for the young women aged 18-25 who take the diet pill is to the right of the population of weight losses of the young women who do not take the diet pill
α = 0.04
We are told the distributions of weight-loss are non-normal, and that both are skewed to the right, so there is similarity in shape of the distributions. No indication is given of the spread of the data, so we will assume similar spreads, noting that our conclusions may not be valid if this is not the case. We are given the rank sums, and can proceed manually, or use the Excel template. The completed Excel template is shown below. Of course, you could also do this calculation with the formulas.
Making Decisions About Two Population Locations, Two Independent Samples of Non-Normal Quantitative Data or Ranked Data (WRST)
Sample 1 Size: 25
Sample 2 Size: 25
Are both sample sizes at least 10? yes
Are the sample histograms similar in shape and spread? yes
W1: 700
W2: 575
z-Score (based on W1): 1.21267813
One-Tailed p-Value: 0.11262645
Two-Tailed p-Value: 0.22525291

If the weight losses with the diet pill are higher, we would expect W1 to be high. The p-value for a one-tailed test is 0.113, which is > α = 0.04. Fail to reject H0. There is insufficient evidence to infer that the population of weight losses of the young women aged 18-25 who took the diet pill is to the right of the population of weight losses for those who did not take the diet pill.

Chapter Review Exercises

Throughout these exercises, it is often possible to do the calculations manually, or with Excel. Manual calculations are sometimes illustrated, and when they are not, the results should be close to the Excel output.

1.
It is preferable to use the t-test, if the necessary conditions are met, because it is harder to reject the null hypothesis with the Wilcoxon Rank Sum Test. The t-test uses all of the information available from the sample data, while the WRST uses the ranks, not the actual values. Any time we can use the actual values to make a decision, we should.
3.
When samples are different sizes, they will tend to have different rank sums, even if they come from equivalent populations. The smaller sample will have a smaller rank sum, simply because there are fewer data points. So, when comparing rank sums, we have to take this into consideration. The table is based on the rank sum being calculated from the smallest of the two samples. The conclusions could be wrong if you mistakenly calculate W from the larger sample.
5.
The Excel template is preferred because the t-score will be more accurate than the one used for the manual calculation.
7.
With 90% confidence, we estimate that the interval (1.52 minutes, 17.68 minutes) contains the true reduction in the average amount of time managers spend on email after the new procedures. The completed Excel template is shown below. Of course, this could also be done manually, using the formula
(x̄1 - x̄2) ± t-score • √(s1²/n1 + s2²/n2).

Confidence Interval Estimate for the Difference in Population Means
Do the sample data appear to be normally distributed? yes
Sample 1 Standard Deviation: 22.3
Sample 2 Standard Deviation: 10.6
Sample 1 Mean: 49.2
Sample 2 Mean: 39.6
Sample 1 Size: 27
Sample 2 Size: 25
Confidence Level (decimal form): 0.9
Upper Confidence Limit: 17.6756
Lower Confidence Limit: 1.52438

9.
This question can be done manually with the formula, or with the Excel template. The completed template is shown below.
Confidence Interval Estimate for the Difference in Population Means
Do the sample data appear to be normally distributed? yes
Sample 1 Standard Deviation: 0.6
Sample 2 Standard Deviation: 1.3
Sample 1 Mean: 2.2
Sample 2 Mean: 2.6
Sample 1 Size: 55
Sample 2 Size: 55
Confidence Level (decimal form): 0.95
Upper Confidence Limit: -0.0155
Lower Confidence Limit: -0.7845

We have 95% confidence that the interval (-0.78 hours, -0.02 hours) contains the change in the average amount of time men spend doing unpaid work around the house in 2000, compared with 2009. This means that (0.02 hours, 0.78 hours) contains the increase in the average amount of time spent doing unpaid work around the house in 2009 compared with 2000.

11. First, realize these are matched-pairs data. Prices are for the same book each year. (Remember to think about whether you have independent or matched-pairs samples, because the techniques for each are different.) Next check to see if the differences are normally distributed. One possible histogram of differences is shown below.
[Histogram: Book Price Comparison. Horizontal axis: (Book Price Last Year) - (Book Price This Year); vertical axis: Frequency.]

The histogram is skewed to the right, but somewhat normal in shape. The results of the Data Analysis tool for the t-test are shown below.
t-Test: Paired Two Sample for Means

                                Book Price Last Year   Book Price This Year
Mean                            14.54                  12.426
Variance                        30.37956842            30.895162
Observations                    20                     20
Pearson Correlation             0.879011986
Hypothesized Mean Difference    0
df                              19
t Stat                          3.471780487
P(T<=t) one-tail                0.001276846
t Critical one-tail             1.729132792
P(T<=t) two-tail                0.002553692
t Critical two-tail             2.09302405
H0: µD = 0
H1: µD > 0 (The order of subtraction is (book price last year) - (book price this year). If book prices have decreased, this difference would be positive, on average.)
α = 0.05
The p-value is 0.001 < 0.05. Reject H0. There is enough evidence to suggest that book prices are lower this year than last year. Note that this is not a random sample. These are books that are of interest to this particular consumer. We should be cautious about drawing a conclusion about all books, based on these data.

13. The completed Excel template is shown below.
Confidence Interval Estimate for the Difference in Population Means
Do the sample data appear to be normally distributed? yes
Sample 1 Standard Deviation: 5.2
Sample 2 Standard Deviation: 3.6
Sample 1 Mean: 19.2
Sample 2 Mean: 12.3
Sample 1 Size: 15
Sample 2 Size: 20
Confidence Level (decimal form): 0.99
Upper Confidence Limit: 11.2948
Lower Confidence Limit: 2.50523

We have 99% confidence that the interval (2.5, 11.3) contains the true overestimation of the number of exercises required to master this topic, compared to the actual experience of students. We would not particularly expect this interval to contain zero, since the hypothesis test in Exercise 12 concluded that professors have higher expectations about the number of exercises required to master a topic, compared with students. The 99% confidence interval is wider than the interval that directly corresponds to the hypothesis test in Exercise 12 (the tail area there would be 1%; for a 99% confidence interval, there is only ½% in each tail). However, even the wider interval does not contain zero.
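The confidence-interval formula used throughout these exercises can be sketched in Python. The function name is my own, and the t-score is supplied as an input (from a t-table or software) rather than computed; in the usage example below, 1.687 is an assumed value, approximately the 5%-tail t value for the Welch degrees of freedom (about 38) in Exercise 7:

```python
from math import sqrt

def diff_means_ci(x1, x2, s1, s2, n1, n2, t_score):
    """(x1 - x2) +/- t_score * sqrt(s1^2/n1 + s2^2/n2).

    t_score is the critical value for the chosen confidence level
    and degrees of freedom, taken from a t-table or software.
    """
    centre = x1 - x2
    margin = t_score * sqrt(s1 ** 2 / n1 + s2 ** 2 / n2)
    return centre - margin, centre + margin

# Exercise 7's summary statistics; 1.687 is an assumed critical value
lo, hi = diff_means_ci(49.2, 39.6, 22.3, 10.6, 27, 25, 1.687)
print(round(lo, 2), round(hi, 2))  # 1.52 17.68, matching the template
```

Keeping the t-score as an explicit input mirrors the manual method: the arithmetic is trivial, and all of the statistical content sits in the choice of critical value.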
15. The completed Excel template is shown below.
Confidence Interval Estimate for the Difference in Population Means
Do the sample data appear to be normally distributed? yes
Sample 1 Standard Deviation: 362
Sample 2 Standard Deviation: 223
Sample 1 Mean: 862
Sample 2 Mean: 731
Sample 1 Size: 31
Sample 2 Size: 25
Confidence Level (decimal form): 0.9
Upper Confidence Limit: 263.135
Lower Confidence Limit: -1.1352

We have 90% confidence that the interval (-1.1, 263.1) contains the difference in the number of pages produced by the two brands of printer cartridge, under these conditions.

17. The completed Excel template is shown below.
Confidence Interval Estimate for the Difference in Population Means
Do the sample data appear to be normally distributed? yes
Sample 1 Standard Deviation: 2.6
Sample 2 Standard Deviation: 1.9
Sample 1 Mean: 8.5
Sample 2 Mean: 6.5
Sample 1 Size: 34
Sample 2 Size: 36
Confidence Level (decimal form): 0.95
Upper Confidence Limit: 3.09397
Lower Confidence Limit: 0.90603

We have 95% confidence that the interval (0.91 minutes, 3.09 minutes) contains the true extra average wait time for support for the ITM computers, compared with the Dull computers. This confidence interval corresponds directly to the two-tailed hypothesis test in Exercise 16. Since the null hypothesis of no difference was rejected there, we would not expect this confidence interval to contain zero (and it does not).
19. The completed Excel template is shown below.
Confidence Interval Estimate for the Difference in Population Means
Do the sample data appear to be normally distributed? yes
Sample 1 Standard Deviation: 16.7856
Sample 2 Standard Deviation: 24.4516
Sample 1 Mean: 50.6333
Sample 2 Mean: 59
Sample 1 Size: 30
Sample 2 Size: 35
Confidence Level (decimal form): 0.97
Upper Confidence Limit: 3.07098
Lower Confidence Limit: -19.804

With 97% confidence, we estimate that the interval (-19.8 minutes, 3.07 minutes) contains the true difference between the average amount of time the advisor spent with her clients in January a year ago, compared with this January.

21. The completed Excel template is shown below.
Confidence Interval Estimate for the Difference in Population Means
Do the sample data appear to be normally distributed? yes
Sample 1 Standard Deviation: 414.969
Sample 2 Standard Deviation: 262.492
Sample 1 Mean: 799.8
Sample 2 Mean: 608
Sample 1 Size: 30
Sample 2 Size: 35
Confidence Level (decimal form): 0.96
Upper Confidence Limit: 377.26
Lower Confidence Limit: 6.34049

At a 96% confidence level, it is estimated that the interval (6.3 minutes, 377.3 minutes) contains the reduction in the amount of time that sales reps would spend over a two-week period, if the new software was adopted.
Chapter 11 Solutions

Develop Your Skills 11.1

1. These data are collected on a random sample of days. They should be independent, unless the locations are close enough to each other that the foot traffic at each would be affected by the same factors. We will assume this is not the case. Histograms below show approximate normality.
[Histogram: Daily Foot Traffic at Location 1. Horizontal axis: Number of People; vertical axis: Number of Days.]

[Histogram: Daily Foot Traffic at Location 2. Horizontal axis: Number of People; vertical axis: Number of Days.]

[Histogram: Daily Foot Traffic at Location 3. Horizontal axis: Number of People; vertical axis: Number of Days.]
The histogram for foot traffic at location 1 shows some right-skewness, but sample sizes are reasonable, and close to the same, so we will assume the population data are normally distributed. The largest variance is 478.7 (for location 2), and the smallest is 257.2 (location 1). The largest variance is less than twice as large as the smallest. So, following our rule, we will assume the population variances are approximately equal. Therefore, these data meet the required conditions for one-way ANOVA.

3. We will presume that the college collected the sample data appropriately, so the data are independent and truly random. The histograms suggest normality.

[Histogram: Annual Salaries of Marketing Graduates. Horizontal axis: Annual Salary; vertical axis: Number of Graduates.]

[Histogram: Annual Salaries of Accounting Graduates. Horizontal axis: Annual Salary; vertical axis: Number of Graduates.]

[Histogram: Annual Salaries of Human Resources Graduates. Horizontal axis: Annual Salary; vertical axis: Number of Graduates.]

[Histogram: Annual Salaries of General Business Graduates. Horizontal axis: Annual Salary; vertical axis: Number of Graduates.]
The largest variance is 159,729,974, and the smallest is 70,826,421. The ratio of the largest to the smallest is about 2.3, which meets the requirement (less than four). These data appear to meet the requirements for one-way ANOVA.
5.
The histograms appear approximately normal. We have to be a bit cautious about assuming these are random samples. For example, one class may be mostly Accounting students, one may be mostly Marketing students, etc. The students who have selected these programs may have different levels of interest and aptitudes for statistics. We will assume that the classes are approximately randomly selected, in the absence of other information, but should note the caution. The largest variance is not much larger than the smallest variance, so we will assume the population variances are approximately equal.
Develop Your Skills 11.2

7. H0: µ1 = µ2 = µ3
H1: At least one µ differs from the others.
α = 0.05
nT = 150, n1 = 50, n2 = 50, n3 = 50, k = 3
x̄1 = 77.5684, x̄2 = 119.6708, x̄3 = 132.4674
s1² = 652.9145, s2² = 555.0899, s3² = 625.7846
SSbetween = 82504.4210, SSwithin = 89855.6606
We have already checked for normality and equality of variances.
F = 67.5
The F-distribution has 2, 147 degrees of freedom. Excel provides a p-value of approximately zero. Reject H0. There is sufficient evidence to conclude that customers in different age groups make different average purchases.

9.
H0: µ1 = µ2 = µ3
H1: At least one µ differs from the others.
α = 0.05
nT = 30, n1 = 10, n2 = 10, n3 = 10, k = 3
x̄1 = 47, x̄2 = 34.6, x̄3 = 48.7
s1² = 78.4444, s2² = 67.1556, s3² = 94.0111
SSbetween = 1184.8667, SSwithin = 2156.5
We have already checked for normality and equality of variances.
F = 7.4
The F-distribution has 2, 27 degrees of freedom. Excel provides a p-value of 0.0027. Reject H0. There is sufficient evidence to conclude that the average commuting time for at least one of the routes is different from the others. The Excel output is shown below.
Anova: Single Factor

SUMMARY
Groups     Count   Sum   Average   Variance
Route 1    10      470   47        78.44444
Route 2    10      346   34.6      67.15556
Route 3    10      487   48.7      94.01111

ANOVA
Source of Variation   SS           df   MS         F          P-value
Between Groups        1184.86667   2    592.4333   7.417436   0.002708
Within Groups         2156.5       27   79.87037
Total                 3341.36667   29
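The ANOVA table can be reproduced from the per-route summary statistics alone, since SSbetween and SSwithin depend only on the group sizes, means, and variances. A sketch (function name mine):

```python
def one_way_anova_from_summaries(ns, means, variances):
    """F statistic for one-way ANOVA, computed from per-group
    sample sizes, means, and variances."""
    k = len(ns)
    n_t = sum(ns)
    grand = sum(n * m for n, m in zip(ns, means)) / n_t
    ss_between = sum(n * (m - grand) ** 2 for n, m in zip(ns, means))
    ss_within = sum((n - 1) * v for n, v in zip(ns, variances))
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_t - k)
    return ms_between / ms_within

# The three routes from Exercise 9
f = one_way_anova_from_summaries(
    [10, 10, 10], [47, 34.6, 48.7], [78.4444, 67.1556, 94.0111])
print(round(f, 2))  # 7.42
```

This matches the F of 7.417436 in the Excel output; the p-value then comes from the F-distribution with 2 and 27 degrees of freedom.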
Develop Your Skills 11.3

11. Completed Excel templates are shown below. For locations 1 and 3:
Tukey-Kramer Confidence Interval
Was the null hypothesis rejected in the ANOVA test? yes
x-bar i: 50.5556
x-bar j: 74.3214
ni: 27
nj: 28
q (from Appendix 7): 3.4
MSwithin: 360.682607
Upper Confidence Limit: -11.4505171
Lower Confidence Limit: -36.0812289
For locations 2 and 3:

Tukey-Kramer Confidence Interval
Was the null hypothesis rejected in the ANOVA test? yes
x-bar i: 56.6000
x-bar j: 74.3214
ni: 30
nj: 28
q (from Appendix 7): 3.4
MSwithin: 360.682607
Upper Confidence Limit: -5.72364915
Lower Confidence Limit: -29.719208

For locations 1 and 2:

Tukey-Kramer Confidence Interval
Was the null hypothesis rejected in the ANOVA test? yes
x-bar i: 50.5556
x-bar j: 56.6000
ni: 27
nj: 30
q (from Appendix 7): 3.4
MSwithin: 360.682607
Upper Confidence Limit: 6.06771106
Lower Confidence Limit: -18.1565999
The first two confidence intervals do not contain zero, so it appears that the average number of people passing by location 3 is greater than at the other two locations.
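The Tukey-Kramer limits in these templates follow the formula (x̄i - x̄j) ± q · √((MSwithin/2)(1/ni + 1/nj)). A sketch that reproduces the locations 1 and 3 interval (the function name is my own; q still has to be looked up for k groups and the within-group degrees of freedom):

```python
from math import sqrt

def tukey_kramer_ci(xbar_i, xbar_j, n_i, n_j, q, ms_within):
    """Tukey-Kramer interval for mu_i - mu_j after a significant ANOVA.

    q is the studentized range value for k groups and the
    within-group degrees of freedom (Appendix 7 in the text).
    """
    margin = q * sqrt((ms_within / 2) * (1 / n_i + 1 / n_j))
    diff = xbar_i - xbar_j
    return diff - margin, diff + margin

# Locations 1 and 3 from Exercise 11
lo, hi = tukey_kramer_ci(50.5556, 74.3214, 27, 28, 3.4, 360.682607)
print(round(lo, 2), round(hi, 2))  # -36.08 -11.45
```

These agree with the template's limits of -36.0812289 and -11.4505171 to the precision of the rounded means.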
13. Completed Excel templates are shown below (to save space, the row checking for rejection of the null hypothesis in ANOVA is not shown).

Marketing and Accounting:

Tukey-Kramer Confidence Interval
x-bar i: 51395.000
x-bar j: 71170.000
ni: 20
nj: 20
q (from Appendix 7): 3.74
MSwithin: 105998118.4210530
Upper Confidence Limit: -11164.94982
Lower Confidence Limit: -28385.05018
Accounting and General:

Tukey-Kramer Confidence Interval
x-bar i: 71170.000
x-bar j: 53885.000
ni: 20
nj: 20
q (from Appendix 7): 3.74
MSwithin: 105998118.4210530
Upper Confidence Limit: 25895.05018
Lower Confidence Limit: 8674.949822
Accounting and Human Resources:

Tukey-Kramer Confidence Interval
x-bar i: 71170.000
x-bar j: 56100.000
ni: 20
nj: 20
q (from Appendix 7): 3.74
MSwithin: 105998118.4210530
Upper Confidence Limit: 23680.05018
Lower Confidence Limit: 6459.949822
Marketing and Human Resources:

Tukey-Kramer Confidence Interval
x-bar i                  51395.000
x-bar j                  56100.000
ni                       20
nj                       20
q (from Appendix 7)      3.74
MSwithin                 105998118.4210530
Upper Confidence Limit   3905.050178
Lower Confidence Limit   -13315.05018
At this point, no further comparisons are necessary. Since this interval contains zero, there does not appear to be a significant difference between the average salaries of Marketing graduates and Human Resources graduates. The differences between the sample means for all other pairs are smaller than for this pair, and so we know there will not be a significant difference for the other pairs.

To summarize: We have 95% confidence that the interval
• (-$28,385.05, -$11,164.95) contains the average difference in the salaries of Marketing graduates, compared to Accounting graduates (in other words, the average salary of Accounting graduates is likely at least $11,164.95 higher)
• ($8,674.95, $25,895.05) contains the average difference in the salaries of Accounting graduates, compared to General Business graduates
• ($6,459.95, $23,680.05) contains the average difference in the salaries of Accounting graduates, compared to Human Resources graduates.

The differences between the average salaries of Human Resources, General Business, and Marketing graduates are not significant.

15. We have to be careful NOT to answer this question merely by inspection! First we recall that the F-test for ANOVA indicated a rejection of the null hypothesis. We have sample evidence that the population means are not all the same. The completed Excel templates are shown below.

For assigned quizzes and sample tests only:
Tukey-Kramer Confidence Interval
Was the null hypothesis rejected in the ANOVA test?   yes
x-bar i                  70.1111
x-bar j                  54.0667
ni                       45
nj                       45
q (from Appendix 7)      3.36
MSwithin                 218.900673
Upper Confidence Limit   23.455099
Lower Confidence Limit   8.63378989
We have 95% confidence that the interval (8.6, 23.5) contains the amount by which the average mark for all those who used the online software for assigned quizzes exceeds the average mark for all those who used sample tests only. Thus it appears that the average mark is at least 8.6 percentage points higher for those who use the online software for assigned quizzes.

For assigned quizzes for marks, and quizzes for no marks:
Tukey-Kramer Confidence Interval
Was the null hypothesis rejected in the ANOVA test?   yes
x-bar i                  70.1111
x-bar j                  56.6889
ni                       45
nj                       45
q (from Appendix 7)      3.36
MSwithin                 218.900673
Upper Confidence Limit   20.8328768
Lower Confidence Limit   6.01156767
Once again, it appears that the average marks are higher when the online software is used for assigned quizzes for marks, compared with quizzes for no marks. We have 95% confidence that the interval (6.0, 20.8) contains the amount by which the average marks are higher when the online software is used for assigned quizzes for marks. We cannot conclude that there is a difference in the average marks when the online software is used for quizzes (no marks) or sample tests only. The confidence interval shown below contains zero.
Tukey-Kramer Confidence Interval
Was the null hypothesis rejected in the ANOVA test?   yes
x-bar i                  56.6889
x-bar j                  54.0667
ni                       45
nj                       45
q (from Appendix 7)      3.36
MSwithin                 218.900673
Upper Confidence Limit   10.0328768
Lower Confidence Limit   -4.78843233
We have evidence that assigning quizzes for marks results in the best average marks for students. However, as we cautioned before, we cannot be certain of the cause-and-effect relationship here, because there are many potentially confounding variables.

Chapter Review Exercises

1. The histograms appear approximately normal, although there is some skewness in each one. However, with the large sample sizes, it is not unreasonable to assume the normality requirements are met.

3. The missing values are shown below in bold type.

SUMMARY
Groups     Count   Sum    Average       Variance
Class #1   95      5840   61.47368421   370.0179171
Class #2   95      5088   53.55789474   590.6535274
Class #3   95      6075   63.94736842   415.5823068

ANOVA
Source of Variation   SS            df    MS            F
Between Groups        5596.133333   2     2798.066667   6.099311258
Within Groups         129367.8526   282   458.7512505
Total                 134963.986    284

5. Because of the balanced design, these calculations simplify to:

(x̄i − x̄j) ± q·√(MSwithin/n)
(x̄i − x̄j) ± 3.31·√(458.7512505/95)
(x̄i − x̄j) ± 7.273691
For Class 2 and Class 3:
(53.5579 – 63.9474) ± 7.273691
(-17.7, -3.1)

We have 95% confidence that the interval (-17.7, -3.1) contains the difference between the average marks of Class 2 and Class 3. In other words, it appears that the average marks of those with the Class 3 professor are at least 3 percentage points higher than the average mark for those with the Class 2 professor.

For Class 1 and Class 2:
(61.4737 – 53.5579) ± 7.273691
(0.6, 15.2)
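As a quick arithmetic check (not part of the original solution), the common half-width for the balanced design can be computed directly:

```python
import math

# Balanced design: every class has n = 95 students, so the half-width
# is the same for every pairwise comparison.
q = 3.31                  # q-score used in the solution (from Appendix 7)
ms_within = 458.7512505
n = 95

half_width = q * math.sqrt(ms_within / n)
print(round(half_width, 6))  # approximately 7.273691

# Class 2 vs Class 3
diff = 53.5579 - 63.9474
print(round(diff - half_width, 1), round(diff + half_width, 1))  # approximately (-17.7, -3.1)
```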
We have 95% confidence that the interval (0.6, 15.2) contains the difference between the average marks of Class 1 and Class 2. In other words, it appears that the average marks of those with the Class 1 professor are at least 0.6 percentage points higher than the average mark for those with the Class 2 professor.

For Class 1 and Class 3:
(61.4737 – 63.9474) ± 7.273691
(-9.7, 4.8)

In this case, the interval contains zero, and so there does not appear to be a significant difference between the average marks of those with the Class 1 professor and those with the Class 3 professor.

From these comparisons, it appears that the average marks are lower for the Class 2 professor's classes, and so this class should be avoided. There is no significant difference between the average marks for Class 1 and Class 3. The choice should then be: any professor but the one who led Class 2. However, this is not a valid method of choosing classes, because there could be many explanations for why the Class 2 marks were significantly lower. It could have to do with the teacher's expertise and evaluation methods. But it could also have arisen because of other factors: the students in Class 2 might have been less well-prepared, they may have worked more, or had family responsibilities that prevented them from studying, the class times might have been inconvenient, etc.

7.
The requirement for equal variances is met. The largest variance is 14.757, which is only 2.3 times as large as the smallest variance, which is 6.314. The missing values are shown below, in bold type.

SUMMARY
Groups       Count   Sum   Average    Variance
Employee 1   35      404   11.54286   6.314286
Employee 2   37      462   12.48649   14.75676
Employee 3   32      357   11.15625   10.32964
Employee 4   42      377   8.97619    13.536

ANOVA
Source of Variation   SS         df    MS         F
Between Groups        264.6295   3     88.20984   7.726613
Within Groups         1621.124   142   11.41637
Total                 1885.753   145
9.
The completed Excel templates are shown below.

Employee 4 and Employee 2:

Tukey-Kramer Confidence Interval
Was the null hypothesis rejected in the ANOVA test?   yes
x-bar i                  8.97619048
x-bar j                  12.4864865
ni                       42
nj                       37
q (from Appendix 7)      3.68
MSwithin                 11.4163655
Upper Confidence Limit   -1.52792567
Lower Confidence Limit   -5.49266635
We have 95% confidence that the interval (-5.5, -1.5) contains the number of minutes by which the average time spent with customers before making a sale for Employee 4 differs from the average time spent by Employee 2. In other words, we expect the average time spent by Employee 4 is at least 1.5 minutes less than Employee 2. Employee 4 and Employee 1:
Tukey-Kramer Confidence Interval
Was the null hypothesis rejected in the ANOVA test?   yes
x-bar i                  8.97619048
x-bar j                  11.5428571
ni                       42
nj                       35
q (from Appendix 7)      3.68
MSwithin                 11.4163655
Upper Confidence Limit   -0.55440966
Lower Confidence Limit   -4.57892367
We have 95% confidence that the interval (-4.6, -0.6) contains the number of minutes by which the average time spent with customers before making a sale for Employee 4 differs from the average time
spent by Employee 1. In other words, we expect the average time spent by Employee 4 is at least 0.5 minutes less than Employee 1.

Employee 4 and Employee 3:
Tukey-Kramer Confidence Interval
Was the null hypothesis rejected in the ANOVA test?   yes
x-bar i                  8.97619048
x-bar j                  11.15625
ni                       42
nj                       32
q (from Appendix 7)      3.68
MSwithin                 11.4163655
Upper Confidence Limit   -0.11699421
Lower Confidence Limit   -4.24312484
We have 95% confidence that the interval (-4.2, -0.1) contains the number of minutes by which the average time spent with customers before making a sale for Employee 4 differs from the average time spent by Employee 3. In other words, we expect the average time spent by Employee 4 is at least 0.1 minutes less than Employee 3. Employee 2 and Employee 3:
Tukey-Kramer Confidence Interval
Was the null hypothesis rejected in the ANOVA test?   yes
x-bar i                  12.4864865
x-bar j                  11.15625
ni                       37
nj                       32
q (from Appendix 7)      3.68
MSwithin                 11.4163655
Upper Confidence Limit   3.45272548
Lower Confidence Limit   -0.79225251
Since this interval contains zero, we conclude there is no significant difference between the average number of minutes Employees 2 and 3 spend with customers before making a sale.
At this point, we can conclude that there are no significant differences between the average number of minutes Employees 1, 2 and 3 spend with customers before making a sale (the differences in the sample means are all less than the difference for Employees 2 and 3). This means that the average amount of time spent by Employee 4 is less than the average amount of time spent by the other employees. 11. The Excel output is shown below.
Anova: Single Factor

SUMMARY
Groups                                    Count   Sum   Average    Variance
Number of Accidents, Training Method #1   30      281   9.366667   8.309195
Number of Accidents, Training Method #2   30      331   11.03333   9.757471
Number of Accidents, Training Method #3   30      362   12.06667   16.47816

ANOVA
Source of Variation   SS         df   MS         F          P-value
Between Groups        111.3556   2    55.67778   4.835263   0.010205
Within Groups         1001.8     87   11.51494
Total                 1113.156   89
H0: µ1 = µ2 = µ3
H1: At least one µ differs from the others.
α = 0.025
nT = 90, n1 = 30, n2 = 30, n3 = 30
x̄1 = 9.3667, x̄2 = 11.0333, x̄3 = 12.0667
s1² = 8.3092, s2² = 9.7575, s3² = 16.4782
SSbetween = 111.3556, SSwithin = 1001.8
We have already checked for normality and equality of variances.
F = 4.835
Excel provides a p-value of 0.010205. Reject H0. There is sufficient evidence to conclude that the average number of factory accidents is different, according to the training method. However, we
cannot be certain that it is the training method that caused these differences. There may be other factors involved.

13. Histograms of the sample data show significant skewness for some of the connection times. The data for early morning and late afternoon connection times appear skewed to the right, and the connection times for the evening are skewed to the left. Sample sizes are also relatively small. As a result, it would probably not be wise to proceed with ANOVA here, as the required conditions do not appear to be met.
[Histograms titled "Connection Times to Online Mutual Fund Account" (frequency vs. times in seconds) for the late afternoon, evening, early afternoon, early morning, and mid-day connection times.]
15. First, check conditions. The data are not actually random samples, but could perhaps be considered to be (see the explanation in the exercise). Histograms of the data are shown below.
[Histograms of final grades (frequency vs. final grade) for the classes scheduled at 8 a.m. Thursday, 4 p.m. Friday, and 2 p.m. Wednesday.]

The histograms appear reasonably normal. The Excel ANOVA output is shown below.
Anova: Single Factor

SUMMARY
Groups                                           Count   Sum    Average    Variance
Marks of Class Scheduled for 8 a.m. Thursdays    20      1257   62.85      268.0289
Marks of Class Scheduled for 4 p.m. Fridays      23      1650   71.73913   305.2016
Marks of Class Scheduled for 2 p.m. Wednesdays   25      1691   67.64      263.99

ANOVA
Source of Variation   SS         df   MS         F          P-value
Between Groups        845.314    2    422.657    1.514253   0.22763
Within Groups         18142.74   65   279.1192
Total                 18988.06   67
We can see from the output that the variances are sufficiently similar to allow us to assume the requirements for ANOVA are met (population variances approximately equal).

H0: µ1 = µ2 = µ3
H1: At least one µ differs from the others.
α = 0.01
nT = 78, n1 = 20, n2 = 23, n3 = 25, k = 3
x̄1 = 62.85, x̄2 = 71.74, x̄3 = 67.64
s1² = 268.03, s2² = 305.20, s3² = 263.99
SSbetween = 845.31, SSwithin = 18142.74
We have already checked for normality and equality of variances.
F = 1.514
Excel provides a p-value of 0.23. Fail to reject H0. There is not enough evidence to conclude that the mean grades for the students in classes for all three schedules are not equal. It does not appear that the scheduled time for classes affects the marks. However, we should be cautious, because there are many other factors that could be affecting marks. If we could control for them, we would be in a better position to investigate the effects of class schedule on student grades.
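The F statistic in the output can be reconstructed from the sums of squares and degrees of freedom (a quick Python sketch, not part of the original solution):

```python
# Values taken from the ANOVA table for the class-schedule exercise
ss_between, df_between = 845.314, 2
ss_within, df_within = 18142.74, 65

ms_between = ss_between / df_between   # mean square between groups (422.657)
ms_within = ss_within / df_within      # mean square within groups (about 279.12)
f_stat = ms_between / ms_within
print(round(f_stat, 6))  # approximately 1.514253, matching the Excel output
```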
17.
H0: µ1 = µ2 = µ3 = µ4 = µ5 = µ6
H1: At least one µ differs from the others.
α = 0.05
nT = 270, n1 = 45, n2 = 45, n3 = 45, n4 = 45, n5 = 45, n6 = 45, k = 6
x̄1 = 23.46, x̄2 = 27.50, x̄3 = 34.84, x̄4 = 35.65, x̄5 = 36.60, x̄6 = 26.05
s1² = 106.43, s2² = 83.10, s3² = 57.77, s4² = 147.78, s5² = 121.01, s6² = 78.81
SSbetween = 7179.961, SSwithin = 26175.49
We have already checked for normality and equality of variances.
F = 14.5
Excel provides a p-value of approximately zero. Reject H0. There is enough evidence to conclude that the mean purchases of customers in different age groups are not all equal, when we consider the most recent purchases of those who entered the contest.

19. This question has already been answered, in the discussion of exercise 16. We proceeded, for practice, but these data do not represent a random sample of data about the drugstore customers.
Chapter 12 Solutions

Develop Your Skills 12.1

1. We are looking for evidence of a decrease in the proportion of on-time flights after the merger. Call the population of flights before the merger population 1, and the population of flights after the merger population 2.
H0: p1 – p2 = 0
H1: p1 – p2 > 0
α = 0.04
p̂1 = 85/100 = 0.85, n1 = 100, p̂2 = 78/100 = 0.78, n2 = 100
Sampling is done without replacement, but it is likely that the airline handles many thousands of flights, so we can still use the binomial distribution as the appropriate underlying model.
Check for normality of the sampling distribution:
n1p̂1 = 100(0.85) = 85 > 10
n1q̂1 = 100(1 – 0.85) = 100(0.15) = 15 > 10
n2p̂2 = 100(0.78) = 78 > 10
n2q̂2 = 100(1 – 0.78) = 100(0.22) = 22 > 10
Since the null hypothesis is that there is no difference in the proportions, we can pool the sample data to estimate p̂.
p̂ = (85 + 78)/(100 + 100) = 0.815
We calculate the z-score as:
z = ((p̂1 – p̂2) – 0) / √(p̂q̂(1/n1 + 1/n2)) = ((0.85 – 0.78) – 0) / √((0.815)(1 – 0.815)(1/100 + 1/100)) = 0.07/0.054913568 = 1.2747
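The pooled z calculation can be sketched in a few lines of Python (the function name is ours, not from the text; the counts are those of the exercise):

```python
import math

def pooled_two_prop_z(x1, n1, x2, n2):
    """z statistic for H0: p1 - p2 = 0, pooling the two samples."""
    p1_hat, p2_hat = x1 / n1, x2 / n2
    p_pool = (x1 + x2) / (n1 + n2)                               # pooled estimate of p
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))    # pooled standard error
    return (p1_hat - p2_hat) / se

# 85 of 100 on-time flights before the merger, 78 of 100 after
z = pooled_two_prop_z(85, 100, 78, 100)
print(round(z, 4))  # approximately 1.2747
```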
p-value = P(z ≥ 1.27) = 1 – 0.8980 = 0.102
Since p-value > α, fail to reject H0. There is insufficient evidence to infer that the proportion of on-time flights decreased after the merger.

3.
Call the data on perceptions of female bank employees sample 1 from population 1, and the data on perceptions of male bank employees sample 2 from population 2. We want to know if the proportion of male employees who felt that female employees had as much opportunity for advancement as male employees is more than 10% higher than the proportion of female employees who thought so. So, we are wondering if p2 is more than 10% higher than p1, that is, if p2 – p1 > 0.10. Rewriting this in standard format, we ask the equivalent question: is p1 – p2 < -0.10?
H0: p1 – p2 = -0.10
H1: p1 – p2 < -0.10
α = 0.05
p̂1 = 0.825, n1 = 240, p̂2 = 0.943, n2 = 350
Sampling is done without replacement, but Canadian banks are large employers (in 2006, the Royal Bank employed about 69,000 people, for instance), so we can still use the binomial distribution as the appropriate underlying model.
Check for normality of the sampling distribution:
n1p̂1 = 240(0.825) = 198 > 10
n1q̂1 = 240(1 – 0.825) = 42 > 10
n2p̂2 = 350(0.943) = 330.05 > 10
n2q̂2 = 350(1 – 0.943) = 19.95 > 10
The null hypothesis is that there is a 10% difference in the proportions, so we cannot pool the sample data to estimate p̂.
We calculate the z-score as:
z = ((p̂1 – p̂2) – µ(p̂1–p̂2)) / √(p̂1q̂1/n1 + p̂2q̂2/n2) = ((0.825 – 0.943) – (-0.10)) / √((0.825)(0.175)/240 + (0.943)(0.057)/350) = -0.655
p-value = P(z ≤ -0.66) = 0.2546
Since p-value > α, fail to reject H0. There is not enough evidence to infer that the proportion of male employees who felt that female employees had as much opportunity for advancement as male employees was more than 10% higher than the proportion of female employees who thought so.

5.
We will use Excel to calculate this confidence interval (it could also be done by hand, based on the information acquired in Exercise 4 above). The Excel template is shown below.
Confidence Interval Estimate for the Difference in Population Proportions
Confidence Level (decimal form)   0.9
Sample 1 Proportion               0.18667
Sample 2 Proportion               0.275
Sample 1 Size                     150
Sample 2 Size                     200
n1 • p1hat                        28
n1 • q1hat                        122
n2 • p2hat                        55
n2 • q2hat                        145
Are np and nq >= 10?              yes
Upper Confidence Limit            -0.0146
Lower Confidence Limit            -0.1621
With 90% confidence, we estimate that the interval (-0.162, -0.015) contains the true difference in the proportion of customers who buy the extended warranty, when told about it by the cashier, compared with being exposed to a prominent display at the checkout. Another way to say this: a greater proportion of those exposed to the display bought the extended warranty. We estimate the difference to be contained in the interval (1.5%, 16.2%).

This confidence interval corresponds to the hypothesis test in the preceding exercise. Since we rejected the hypothesis of no difference, we would not expect the confidence interval to contain zero (and it does not).

Develop Your Skills 12.2

7. H0: The distribution of customers' brand preferences is as claimed by the previous manager.
H1: The distribution of customers' brand preferences is different from what was claimed by the previous manager.
α = 0.05 (given)
Expand the table of claimed brand preferences to show expected and observed values, as shown below.
                              Labatt Blue   Labatt Blue Light   Molson Canadian   Kokanee   Rickard's Honey Brown
Claimed Preference            33%           8%                  25%               19%       15%
Expected (for 85 Customers)   28.05         6.8                 21.25             16.15     12.75
Observed                      29            6                   21                16        13
All expected values are more than 5, so we can proceed.
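The goodness-of-fit statistic computed next can be reproduced with a short sketch (the helper function is ours, not from the text):

```python
def chi_square_gof(observed, expected):
    """Chi-square goodness-of-fit statistic: sum of (o - e)^2 / e."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Observed and expected counts from the brand-preference table above
observed = [29, 6, 21, 16, 13]
expected = [28.05, 6.8, 21.25, 16.15, 12.75]
stat = chi_square_gof(observed, expected)
print(round(stat, 4))  # approximately 0.1355
```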
X² = Σ(oi – ei)²/ei = (29 – 28.05)²/28.05 + (6 – 6.8)²/6.8 + (21 – 21.25)²/21.25 + (16 – 16.15)²/16.15 + (13 – 12.75)²/12.75 = 0.1355

Degrees of freedom = k – 1 = 4. Using the tables, we see that p-value > 0.100. Using CHITEST, we see that p-value = 0.9978. Fail to reject H0. There is insufficient evidence to infer that the distribution of customers' brand preferences is different from what the previous manager claimed.

9.
H0: The distribution of customer destination preferences at the travel agency is the same as in the past.
H1: The distribution of customer destination preferences at the travel agency is different from the past.
α = 0.04 (given)
Summarize the data from the random sample and calculate expected values.
                                Canada   U.S.    Caribbean   Europe   Asia   Australia/New Zealand   Other
Past Preferences                28%      32%     22%         12%      2%     3%                      1%
Expected (for a sample of 54)   15.12    17.28   11.88       6.48     1.08   1.62                    0.54
Observed                        22       14      8           8        0      2                       0
In this case, there are three expected values < 5 (Asia, Australia/New Zealand, and Other). It seems logical to combine these categories, and then proceed. The new table of expected and observed values is shown below.
                                Canada   U.S.    Caribbean   Europe   Other
Past Preferences                28%      32%     22%         12%      6%
Expected (for a sample of 54)   15.12    17.28   11.88       6.48     3.24
Observed                        22       14      8           8        2
However, even this change still leaves us with an expected value < 5. We must combine categories again. It is less satisfying to combine Europe with Asia, Australia/New Zealand and Other. However, all of these destinations represent destinations at a significant distance from the North American continent, so there is some sense to combining them.
The final table of expected and observed values is shown below.

                                Canada   U.S.    Caribbean   Europe, Asia, Australia/New Zealand, Other
Past Preferences                28%      32%     22%         18%
Expected (for a sample of 54)   15.12    17.28   11.88       9.72
Observed                        22       14      8           10
Now that all expected values are ≥ 5, we can proceed. Using the formula as before, we calculate X² = 5.028. Using the tables, with 3 degrees of freedom, we see p-value > 0.100. Using CHITEST, we see that p-value = 0.1697. Fail to reject H0. There is insufficient evidence to infer that there has been a change in customer destination preferences at this travel agency.

Develop Your Skills 12.3

11. H0: There is no relationship between the views on the proposed health benefit changes and the type of job held in the organization.
H1: There is a relationship between the views on the proposed health benefit changes and the type of job held in the organization.
α = 0.01
The calculations of expected values for a contingency table can be done manually, but are somewhat tedious. We will use Excel's Non-Parametric Tool for Chi-Squared Expected Value Calculations. The Excel output is shown below.
Chi-Squared Expected Values Calculations
Chi-squared test statistic   16.44338
# of expected values < 5     0
p-value                      0.002478

                         in favour   opposed    undecided
Management               19.32377    15.62705   6.04918
Professional, Salaried   53.72951    43.45082   16.81967
Clerical, Hourly Paid    41.94672    33.92213   13.13115
(The Excel output will allow you to check your manual calculations.)

We see that there are no expected values < 5, so we can proceed. The p-value is 0.002478, which is < α = 0.01. Reject H0. There is sufficient evidence to infer that there is a relationship between the views on the proposed health benefits changes and the type of job held in the organization.

13. H0: There is no relationship between household income and the section of the paper read most closely.
H1: There is a relationship between household income and the section of the paper read most closely.
α = 0.25
The output of the Excel tool for Chi-Squared Expected Value Calculations is shown below.

Chi-Squared Expected Values Calculations
Chi-squared test statistic   51.92698
# of expected values < 5     0
p-value                      1.74E-08

Household Income     National and World News   Business   Sports     Arts       Lifestyle
Under $40,000        48.96498                  40.74708   46.56809   19.1751    20.54475
$40,000 to $70,000   52.58171                  43.75681   50.00778   20.59144   22.06226
over $70,000         41.45331                  34.49611   39.42412   16.23346   17.393
All expected values are ≥ 5, so we can proceed. The p-value is 0.0000000174, which is extremely small. Reject H0. There is evidence of a relationship between household income and the section of the paper read most closely.

15. H0: The proportions of students drawn from inside or outside the local area are the same for the Business, Technology and Nursing programs at a college.
H1: The proportions of students drawn from inside or outside the local area are different for the Business, Technology and Nursing programs at a college.
α = 0.025
The output of the Excel tool for Chi-Squared Expected Value Calculations is shown below.

Chi-Squared Expected Values Calculations
Chi-squared test statistic   0.823106
# of expected values < 5     0
p-value                      0.662621

                      Business   Technology   Nursing
From local area       68.46154   45.64103     63.89744
Not from local area   81.53846   54.35897     76.10256
All expected values are ≥ 5, so we can proceed. The p-value is 0.66 > α = 0.025. Fail to reject H0. There is not enough evidence to infer that the proportions of students drawn from inside or outside the local area are different for the Business, Technology and Nursing programs at a college.

Chapter Review Questions

1. We can pool data when the null hypothesis is that there is NO difference in the population proportions. If that is the case (as we assume), then we can pool the sample data, because both samples provide estimates of the same proportion of successes. We cannot pool the data when the null hypothesis is that the population proportions differ by 5%, because the sample data are providing estimates of two different proportions of success.
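The pooled and unpooled standard errors can be contrasted numerically (a sketch using the on-time-flight figures from Exercise 1 of Develop Your Skills 12.1 as illustrative inputs; the variable names are ours):

```python
import math

p1_hat, n1 = 0.85, 100   # illustrative values from the on-time-flights exercise
p2_hat, n2 = 0.78, 100

# Under H0: p1 - p2 = 0, both samples estimate one common p, so pool them:
p_pool = (p1_hat * n1 + p2_hat * n2) / (n1 + n2)
se_pooled = math.sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))

# Under a hypothesized non-zero difference, the two sample proportions
# estimate two different quantities, so keep the estimates separate:
se_unpooled = math.sqrt(p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2)

print(round(se_pooled, 6), round(se_unpooled, 6))
```

The two standard errors are close here, but they answer different questions: the pooled version is only valid when the null hypothesis says the proportions are equal.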
3.
H0: p1 – p2 = -0.10
H1: p1 – p2 < -0.10
This may not be immediately obvious. Remember, the subscript 1 corresponds to last year's results, and the subscript 2 corresponds to this year's results. If the proportion of people who pass this year is more than 10% higher, then when we subtract p1 – p2, we will get a negative number, and it will be to the left of -0.10 on the number line.

5.
Repeated tests on the same data set lead to higher chances of Type I error, and are therefore not reliable. A Chi-square test allows us to compare all three proportions simultaneously.
7.
Since the Chi-square test is equivalent to the test of proportions, we expect to get the same answer. First, set up the appropriate contingency table for the data, as shown below.

                                  Still working out       Quit working out
                                  in first six months     in first six months
Taking fitness classes            38                      22
Working with a personal trainer   60                      20
The setup of the problem is the same, with the same null and alternative hypotheses. The output of the Excel tool for Chi-Squared Expected Value Calculations is shown below.
Chi-Squared Expected Values Calculations
Chi-squared test statistic   2.222222
# of expected values < 5     0
p-value                      0.136037

                                  Still working out       Quit working out
                                  in first six months     in first six months
Taking fitness classes            42                      18
Working with a personal trainer   56                      24
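The expected counts and the test statistic above can be reproduced from the observed table (the helper function is ours, not the Excel tool itself):

```python
def chi_square_contingency(table):
    """Chi-square statistic for a contingency table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence: row total * column total / grand total
            expected = row_totals[i] * col_totals[j] / grand_total
            stat += (observed - expected) ** 2 / expected
    return stat

# Observed counts: fitness classes vs. personal trainer (still working out / quit)
table = [[38, 22],
         [60, 20]]
stat = chi_square_contingency(table)
print(round(stat, 6))  # approximately 2.222222
```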
The p-value is the same as in the previous problem (it differs slightly only because we used the tables for the calculation in the previous question, which involves rounding the z-score to two decimal places). Of course, the conclusion is also the same. 9.
Call the data on students who were called by program faculty sample 1 from population 1. Call the data on students who were only sent a package in the mail sample 2 from population 2.
H0: p1 – p2 = 0
H1: p1 – p2 > 0
α = 0.025
p̂1 = 234/278 = 0.841726618, n1 = 278, p̂2 = 232/302 = 0.76821192, n2 = 302
Sampling is done without replacement. We have no information on college enrolment. Sample sizes are fairly large. For these samples to be at most 5% of the relevant populations, the college would have to have about 5560 students who were called by faculty in total, and about 6040 who were sent
acceptance packages. This means a fairly large potential first-year enrolment. This is an assumption that we should note before we proceed to use the binomial distribution as the underlying model.
Check for normality of the sampling distribution:
n1p̂1 = 234 > 10
n1q̂1 = 278 – 234 = 44 > 10
n2p̂2 = 232 > 10
n2q̂2 = 302 – 232 = 70 > 10
Since the null hypothesis is that there is no difference in the proportions, we can pool the sample data to estimate p̂.
p̂ = (234 + 232)/(278 + 302) = 466/580 = 0.80344828
We calculate the z-score as:
z = ((p̂1 – p̂2) – 0) / √(p̂q̂(1/n1 + 1/n2)) = (234/278 – 232/302) / √((466/580)(1 – 466/580)(1/278 + 1/302)) = 2.23
(Note that some proportions are left in fractional form to preserve accuracy for calculations with a calculator.)
p-value = P(z ≥ 2.23) = 1 – 0.9871 = 0.0129
Since p-value < α, reject H0. There is sufficient evidence to infer that the proportion of prospective students who send acceptances is higher when they get calls from program faculty (compared with receiving a package in the mail).

11. Call the data on managers who have been sent to conflict resolution training sample 1 from population 1. Call the data on non-managerial employees who have been sent to conflict resolution training sample 2 from population 2.
H0: p1 – p2 = 0
H1: p1 – p2 ≠ 0
α = 0.025
p̂1 = 36/50 = 0.72, n1 = 50, p̂2 = 38/76 = 0.50, n2 = 76
Sampling is done without replacement. We have no information on the total number of employees who have been sent to conflict resolution training. We are told that the company is "large". For these samples to be at most 5% of the relevant populations, the company would have had to send 1000 managerial employees to the training, and 1520 non-managerial employees. This is an assumption that we should note before we proceed to use the binomial distribution as the underlying model.
Check for normality of the sampling distribution:
n1p̂1 = 36 > 10
n1q̂1 = 50 – 36 = 14 > 10
n2p̂2 = 38 > 10
n2q̂2 = 76 – 38 = 38 > 10
Since the null hypothesis is that there is no difference in the proportions, we can pool the sample data to estimate p̂.
p̂ = (36 + 38)/(50 + 76) = 74/126 = 0.587301587
We calculate the z-score as:
z = ((p̂1 – p̂2) – 0) / √(p̂q̂(1/n1 + 1/n2)) = (0.72 – 0.50) / √((74/126)(1 – 74/126)(1/50 + 1/76)) = 2.45
(Note that some proportions are left in fractional form to preserve accuracy for calculations with a calculator.)
p-value = 2 • P(z ≥ 2.45) = 2 • (1 – 0.9929) = 2 • 0.0071 = 0.0142
Since p-value < α, reject H0. There is sufficient evidence to infer there is a difference in the proportions of managers and non-managers who thought that conflict resolution training was a waste of time.

13. We can set this up as a Chi-square test, with the information organized as in the table below.
                           Manufacturer #1   Manufacturer #2   Manufacturer #3
Defective Components       36                30                38
Non-Defective Components   89                95                87
Total                      125               125               125
H0: The proportions of defective items are the same for all three manufacturers.
H1: The proportions of defective items are different among the three manufacturers.
α = 0.05
The output from the Excel tool for Chi-Squared Expected Value Calculations is shown below.
Chi-Squared Expected Values Calculations
Chi-squared test statistic   1.383764
# of expected values < 5     0
p-value                      0.500633

                           #1         #2         #3
Defective Components       34.66667   34.66667   34.66667
Non-Defective Components   90.33333   90.33333   90.33333
All the expected values are ≥ 5, so we can proceed. The p-value is 0.5 > α = 0.05. Fail to reject H0. There is not enough evidence to infer that the proportions of defective items are different among the three manufacturers.

15. Exercise 14 could also be done as a Chi-square test. First, organize the data as shown below.

                                Plant 1   Plant 2
Employees Who Had An Accident   23        23
Employees Who Had No Accident   127       102
Total                           150       125
H0: The proportions of employees who had accidents are the same at the two plants. H1: The proportions of employees who had accidents are different at the two plants. α = 0.05 The output from the Excel tool for Chi-Squared Expected Value Calculations is shown below.
Chi-Squared Expected Values Calculations
Chi-squared test statistic: 0.460335
# of expected values < 5: 0
p-value: 0.497468

          Employees Who Had An Accident   Employees Who Had No Accident
Plant 1   25.09091                        124.9091
Plant 2   20.90909                        104.0909
We arrive at the same conclusion as before (as we would expect). Once again, the p-value is 0.497. Since p-value > α, fail to reject H0. There is insufficient evidence to infer there is a difference in the proportions of employees who had accidents at the two plants.

17. H0: The absences are equally distributed across the five working days of the week.
H1: The absences are not equally distributed across the five working days of the week.
α = 0.05
There are 48 absences in total in the sample. If the absences are equally distributed across the five working days of the week, then we would expect each of the five days to have 48/5 = 9.6 absences.
X² = Σ (oᵢ − eᵢ)²/eᵢ
   = (15 − 9.6)²/9.6 + (6 − 9.6)²/9.6 + (4 − 9.6)²/9.6 + (7 − 9.6)²/9.6 + (16 − 9.6)²/9.6
   = 12.625
p-value = P(X² > 12.625) = 0.013261 (using Excel's CHITEST).
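The same goodness-of-fit calculation in code; for 4 degrees of freedom the chi-squared tail probability has the closed form e^(−x/2)(1 + x/2), which reproduces the CHITEST value:

```python
from math import exp

# Observed absences for the five working days, from the solution above
observed = [15, 6, 4, 7, 16]
expected = sum(observed) / len(observed)   # 48/5 = 9.6 per day

chi2 = sum((o - expected) ** 2 / expected for o in observed)

# Chi-squared survival function, closed form for df = 4
p_value = exp(-chi2 / 2) * (1 + chi2 / 2)

print(chi2, round(p_value, 6))   # 12.625 0.013261
```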
Using the table, for four degrees of freedom, we see 0.010 < P(X² > 12.625) < 0.025. Reject H0. There is enough evidence to suggest that the absences are not equally distributed across the five working days of the week.

19. H0: The proportions of workers who travel to work via the different methods are the same for the software firm and the accounting firm.
H1: The proportions of workers who travel to work via the different methods at the software firm are different from the proportions of workers who travel to work via the different methods at the accounting firm.
α = 0.05
The output of the Excel tool for Chi-Squared Expected Value Calculations is shown below.

Chi-Squared Expected Values Calculations
Chi-squared test statistic: n/a
# of expected values < 5: 2
p-value: n/a

                  By Transit   In Car    On Bicycle   On Foot
Software Firm     50.8481      15.3038   9.873418     1.974684
Accounting Firm   52.1519      15.6962   10.12658     2.025316
Since some of the expected values are less than 5, we cannot proceed. First we must amalgamate categories in a meaningful way. It seems reasonable to combine the categories of travel by bicycle and by foot, since both of these are self-propelled. The reorganized data set will then be:
                  By Transit   In Car   On Bicycle Or On Foot
Software Firm     51           8        19
Accounting Firm   52           23       5
The new output from the Excel tool for Chi-Squared Expected Value Calculations is shown below.

Chi-Squared Expected Values Calculations
Chi-squared test statistic: 15.41159
# of expected values < 5: 0
p-value: 0.00045

                  By Transit   In Car    On Bicycle Or On Foot
Software Firm     50.8481      15.3038   11.8481
Accounting Firm   52.1519      15.6962   12.1519
Since the expected values are now all ≥ 5, we can proceed. The p-value is very small, at 0.00045. We have very convincing evidence that the proportions of workers who travel to work via the different methods at the software firm are different from the proportions of workers who travel to work via the different methods at the accounting firm.
Chapter 13 Solutions

Develop Your Skills 13.1

1. The scatter diagram is shown below.
[Scatter diagram: "Hendrick Software Sales" — Total Sales ($000) vs. Number of Sales Contacts, with trendline y = 6.6519x + 4.7013]
The least-squares regression line is: total sales ($000) = 6.6519(number of sales contacts) + 4.7013

Interpretation: Each new sales contact results in an increase in sales of approximately $6,652. The y-intercept should not be interpreted, since the sample data did not contain any observations of 0 sales contacts.
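The slope and intercept of such a line come from the usual least-squares formulas b1 = Sxy/Sxx and b0 = ȳ − b1·x̄. A minimal sketch (the four data points below are made up for illustration; the Hendrick data are in the textbook's data files):

```python
def least_squares(x, y):
    """Return slope b1 and intercept b0 of the least-squares line y = b1*x + b0."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    sxx = sum((xi - xbar) ** 2 for xi in x)
    b1 = sxy / sxx
    b0 = ybar - b1 * xbar
    return b1, b0

# Hypothetical (contacts, sales) pairs, not the Hendrick data
b1, b0 = least_squares([5, 8, 11, 14], [40, 60, 75, 100])
print(b1, b0)   # 6.5 7.0
```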
3. A scatter diagram is shown below.

[Scatter diagram: "Smith and Klein Manufacturing" — Sales vs. Promotion Expenditure, with trendline y = 30.21x − 148770]
The least-squares regression line is: annual sales = 30.21(annual promotion spending) − $148,770

Interpretation: Each new dollar in promotion spending results in an increase in annual sales of approximately $30.21. The y-intercept should not be interpreted, since the sample data did not contain any observations of $0 annual promotion spending.

5. Because of the way the researcher has posed the question, the response variable is revenues, and the explanatory variable is the number of employees. The scatter diagram is shown below:

[Scatter diagram: "Top 25 Global Research Organizations, 2007" — Global Research Revenues (US$ Millions) vs. Full-Time Employees, with trendline y = 0.1338x + 140.56]
The least-squares regression line is: revenue (US$ millions) = 0.1338(number of full-time employees) + 140.56

Interpretation: Each additional employee results in increased revenue of US$0.1338 million (or US$133,800). The y-intercept should not be interpreted, since the sample data did not contain any observations of 0 employees.

Develop Your Skills 13.2

7. The scatter diagram does not contain much of a pattern, but if there is a relationship, it appears to be linear.
[Scatter diagram: "Spending on Restaurant Meals and Income" — Monthly Spending on Restaurant Meals vs. Monthly Income, with trendline y = 0.0241x + 44.903]

[Residual plot: Monthly Income Residual Plot — Residuals vs. Monthly Income]
The residual plot shows a fairly constant variability, although the residuals appear to be a little larger on the positive side (except in the area of monthly incomes of around $3,500). There is no obvious dependence among the residuals. A histogram of the residuals appears to be approximately normal.
[Histogram: "Residuals for Model of Restaurant Spending and Monthly Income" — Frequency vs. Residual]
A check of the scatter diagram and the standardized residuals reveals six points that could be considered outliers. They are circled on the scatter diagram below.
[Scatter diagram: "Spending on Restaurant Meals and Income" — Monthly Spending on Restaurant Meals vs. Monthly Income, with the six outliers circled]
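Screening for outliers with standardized residuals can be sketched as follows. This is a simplified version (residual divided by the regression standard error; exact standardized residuals also adjust for leverage), and the data are hypothetical, with one point placed far off the line:

```python
def standardized_residuals(x, y):
    """Least-squares residuals divided by the residual standard error
    (approximate standardized residuals)."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    b1 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
    b0 = ybar - b1 * xbar
    resid = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
    se = (sum(r * r for r in resid) / (n - 2)) ** 0.5
    return [r / se for r in resid]

# Hypothetical data: y is roughly 2x except the last point, far above the line
x = list(range(1, 11))
y = [2, 4, 6, 8, 10, 12, 14, 16, 18, 40]
outliers = [i for i, r in enumerate(standardized_residuals(x, y)) if abs(r) >= 2]
print(outliers)   # [9]
```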
The presence of so many outliers is a cause for concern. [If we had access to the original data set, we would check to see that these observations were accurate.] These outliers obviously increase the variability of the error terms. Even if the data points identified as outliers are correct, they are an indication that the model will probably not be very useful for prediction purposes. There are two points in the data set that may be influential observations. They are indicated in the scatter diagram below.
[Scatter diagram: "Spending on Restaurant Meals and Income" — Monthly Spending on Restaurant Meals vs. Monthly Income, with the two possible influential observations indicated]
To investigate, each point is removed from the data set, to see the effect on the least-squares regression line. The least-squares line for the original sample data set was y = 0.0241x + 44.903. Without the circled point on the right-hand side, the equation changes to y = 0.0214x + 50.639, which is not that much of a change, relatively speaking. Similarly, the outlier at (1258.97, 154.68) could be having a large effect on the least-squares line. Removing it changes the equation to y = 0.0262x + 39.292, which has more of an effect. Still, neither point appears to be affecting the regression relationship by a large amount (relatively speaking). However, at this point in the analysis, it would be useful to go back to the beginning. It does not appear that monthly income is a strong predictor of monthly restaurant spending. There is too much variability in the restaurant spending data, for the various income levels, for us to develop a useful model.
9. With the two erroneous data points removed, the scatter diagram looks as shown below.

[Scatter diagram: "Hours of Work and Semester Marks" — Semester Average Mark vs. Total Hours at Paid Job During Semester, with trendline y = −0.144x + 89.175]
The relationship appears to be linear. The residual plot is shown below.
[Residual plot: Total Hours at Paid Job During Semester Residual Plot — Residuals vs. Total Hours at Paid Job During Semester]
The residuals appear centred on zero, with fairly constant variability, although variability seems greatest in the middle of the range of hours worked. There is no indication that the residuals are dependent. A histogram of the residuals is shown below.
[Histogram: "Residuals for Semester Mark and Hours of Work Data" — Frequency vs. Residual]
The histogram is quite normal in shape. A check of the standardized residuals does not reveal any that are ≥ +2 or ≤ −2, although there is one observation with a standardized residual of −1.99. This is the observation (72, 65). [If we could, we would check this data point to make sure that it is accurate.] This point is quite obvious in both the scatter diagram and residual plot (the point is circled in these two graphs). There are no obvious influential observations, except perhaps for the almost-outlier. Removing this point from the data set does not affect the least-squares regression line significantly. Despite the one troublesome point, the data set does appear to meet the requirements of the theoretical model.

Develop Your Skills 13.3

11. Since the sample data meet the requirements, it is acceptable to proceed with the hypothesis test.
H0: β1 = 0 (that is, there is no linear relationship between the number of sales contacts and sales)
H1: β1 > 0 (that is, there is a positive linear relationship between the number of sales contacts and sales)
α = 0.05
From the Excel output, t = 7.64. The p-value is 9.38E-08, which is very small. The p-value for the one-tailed test is only half of this value, and is certainly < α. In other words, there is almost no chance of getting sample results like these if in fact there is no linear relationship between the number of sales contacts and sales. Therefore, we can (with confidence) reject the null hypothesis and conclude there is evidence of a positive linear relationship between the number of sales contacts and sales data for the Hendrick Software Sales Company.

13. Since the sample data do not meet the requirements of the theoretical model, it is not appropriate to conduct a hypothesis test.

15. Since the sample data do not meet the requirements of the theoretical model, it is not appropriate to conduct a hypothesis test.
Develop Your Skills 13.4

17. The R² value for this data set is only 0.18. This is not surprising, because the scatter diagram of the relationship revealed scarcely any perceivable pattern. Only 18% of the variation in monthly spending on restaurant meals is explained by income. Earlier investigations suggested this model was not worth pursuing, and the low R² value reinforces that.

19. The R² value, at 0.72, suggests that 72% of the variation in semester average marks is explained by hours spent working during the semester. (Note that this is for the amended data set, where the two erroneous grades have been removed; see Develop Your Skills 13.2, Exercise 9.) Obviously, there are many factors that affect semester average marks, for example, ability, study habits, past educational experience, and so on. If the original data were collected in a truly random fashion, these factors may have been randomized. It seems reasonable to conclude that students who work less will have more time for their studies, and it seems reasonable to think that marks improve with time spent studying. However, this data set does not guarantee that reducing work will lead to improved marks.

Develop Your Skills 13.5

21. Since the requirements are met, it is appropriate to create a confidence interval. The Excel output is shown below.
Confidence Interval and Prediction Intervals - Calculations
Confidence Level: 98%
Point Number 1: Number of Sales Contacts = 10
Prediction Interval: (44.96826, 97.471443)
Confidence Interval: (66.068659, 76.37104)

With 98% confidence, the interval ($66,069, $76,371) contains the average sales for 10 sales contacts.

23. Since the requirements are not met, it is not appropriate to create a confidence interval.

25. Since the requirements are not met, it is not appropriate to construct a prediction interval.

Chapter Review Exercises

1. The hypothesis test is only valid if the required conditions are met. If you don't check conditions, you may rely on a hypothesis test when it is misleading.

3. A lower standard error means that confidence and prediction intervals will be narrower. Predictions made with the model will therefore be more useful.

5. It is always tempting to just remove problem data points. However, if you do this, you will often find that the remaining data points also have outliers. If you persist in the practice of removing troublesome data points, you may not have much data left! Careful thinking is a better approach. The outlier may be telling you something really important about the actual relationship between the explanatory and response variables. You wouldn't want to miss this important clue to what is really going on.
7. We have already examined the scatter diagram, which suggests a negative linear relationship. The residual plot is shown below. It has the desired appearance of constant variability, with the residuals centred on zero.

[Residual plot: Odometer Residual Plot — Residuals vs. Odometer]
A histogram of the residuals is shown below. The histogram is not perfectly normally distributed, but it is approximately so.

[Histogram: "Residuals for Honda Civic List Price Model, Based on Odometer" — Frequency vs. Residual]
There are no standardized residuals ≥ +2 or ≤ −2. It appears the sample data meet the requirements of the theoretical model, and so it would be appropriate to use odometer readings to predict the list prices of these used cars.

A 95% prediction interval for the list price for one of these cars with 50,000 kilometres on the odometer is ($12,683, $19,608). The Excel output is shown below.
Confidence Interval and Prediction Intervals - Calculations
Confidence Level: 95%
Point Number 1: Odometer = 50000
Prediction Interval: (12683.4909, 19607.9242)
Confidence Interval: (15259.8312, 17031.584)

9. The coefficient of determination for the TSX and the DJI over the first six months of 2009 is 0.72. This measure suggests that 72% of the variation in the TSX is explained by variation in the DJI.
11. A scatter diagram is shown below.

[Scatter diagram: "Student Marks in Statistics" — Mark on Final Exam vs. Mark on Test #2, with trendline y = 0.9586x + 0.4464]
The estimated relationship is as follows: Mark on final exam = 0.9586(Mark on Test #2) + 0.4464. In other words, it appears the mark on the final exam is about 96% of the mark on Test #2.

13. Since the sample data meet the requirements, it is acceptable to proceed with the hypothesis test.
H0: β1 = 0 (that is, there is no linear relationship between the mark on Test #2 and the final exam mark in Statistics)
H1: β1 > 0 (that is, there is a positive linear relationship between the mark on Test #2 and the final exam mark in Statistics)
α = 0.05
From the Excel output, t = 16.5.
The p-value is 2.96E-14, which is very small. The p-value for the one-tailed test is only half of this value, and is certainly < 5%. In other words, there is almost no chance of getting sample results like these if in fact there is no linear relationship between the mark on Test #2 and the final exam mark in Statistics. Therefore, reject H0 and conclude there is strong evidence of a positive linear relationship between the mark on Test #2 and the final exam mark in Statistics.

15. A scatter diagram of the data is shown below.
[Scatter diagram: "Aries Car Parts" — Auditor's Inventory Value vs. Recorded Parts Inventory Value, with trendline y = 0.9806x + 25.233]
If the inventory records are generally accurate, we would expect the slope of the regression line to be very close to 1, as it appears to be. It appears there is a strong positive relationship between the recorded inventory value and the audited inventory value. The relationship is as follows: auditor's inventory value = 0.9806(recorded parts inventory value) + $25.23

17. While we have some concern about the distribution of residuals, we will proceed with the hypothesis test.
H0: β1 = 0 (that is, there is no linear relationship between the recorded inventory values and the audited inventory values)
H1: β1 ≠ 0 (that is, there is a linear relationship between the recorded inventory values and the audited inventory values)
α = 0.05
An excerpt of Excel's regression output is shown below.
SUMMARY OUTPUT

Regression Statistics
Multiple R          0.995213711
R Square            0.99045033
Adjusted R Square   0.990160946
Standard Error      16.61634358
Observations        35

ANOVA
             df   SS            MS            F
Regression   1    944994.372    944994.372    3422.616936
Residual     33   9111.394836   276.1028738
Total        34   954105.7668

                                 Coefficients   Standard Error   t Stat        P-value
Intercept                        25.22708893    8.612571593      2.929100636   0.006122286
Recorded Parts Inventory Value   0.978281557    0.016721865      58.50313612   6.47389E-35
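As a quick check on the output above, the t statistic for the slope is simply the coefficient estimate divided by its standard error:

```python
# Slope estimate and standard error from the Excel output above
coef = 0.978281557
se = 0.016721865

t = coef / se
print(round(t, 3))   # 58.503
```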
From the Excel output, t = 58.503. The p-value is 6.47389E-35, which is very small, and certainly < 5%. In other words, there is almost no chance of getting sample results like these if in fact there is no linear relationship between the recorded inventory values and the audited inventory values. Therefore, reject the null hypothesis and conclude there is evidence of a linear relationship between the recorded and audited inventory values.

19. The scatter diagram for these data is shown below.
[Scatter diagram: "Revenue and Profit for a Random Sample of Top 1000 Canadian Companies, 2008" — Profit (000) vs. Revenue (000), with trendline y = 5.8784x + 478280]
Notice that the trendline is greatly influenced by the three data points from the three largest organizations in the data set. If we remove these observations, the scatter diagram looks as shown below.
[Scatter diagram: "Revenue and Profit for a Random Sample of Top 1000 Canadian Companies, 2008", with the three largest organizations removed — Profit (000) vs. Revenue (000), with trendline y = −1.3658x + 250301]
The coefficient of determination for the full data set is 0.88, which is quite high. However, the measure is misleading. When the three largest data points are removed, the coefficient of determination is only 0.04, which seems more appropriate. A high coefficient of determination never guarantees that a relationship is a good model, and it certainly does not in this case.
21. As discussed in Exercise 20 above, there appears to be a positive linear relationship between the final overall average grade and the score on the test given during the job interview. The residual plot is shown below.
[Residual plot: Final Average Mark Residual Plot — Residuals vs. Final Average Mark]
The residuals appear randomly distributed around zero, with the same variability for all x-values. A histogram of the residuals is shown below.
[Histogram: "Residuals for Test Score Model" — Frequency vs. Residual]
The residuals appear approximately normally distributed. There are no outliers or obviously influential observations in the data set. It appears these data meet the requirements for the linear regression model.
23. Since the requirements are met, it is appropriate to create a confidence interval estimate. The Excel output is shown below.
Confidence Interval and Prediction Intervals - Calculations
Confidence Level: 98%
Point Number 1: Final Average Mark = 75
Prediction Interval: (43.9917459, 62.278853)
Confidence Interval: (51.4810585, 54.78954)

With 98% confidence, we estimate that the interval (51.5, 54.8) contains the average test score of graduates with an overall average mark of 75.

25. No, it would not be appropriate to use package weight as a predictor of shipping cost. We can see from the residual plot that variability increases as package weight increases.
Chapter 14 Solutions

Develop Your Skills 14.1

1. Scatter diagrams are shown below.

[Scatter diagram: "Salary and Age" — Salary ($000) vs. Age]

[Scatter diagram: "Salary and Years of Postsecondary Education" — Salary ($000) vs. Years of Postsecondary Education]

[Scatter diagram: "Salary and Years of Experience" — Salary ($000) vs. Years of Experience]
All three scatter diagrams show the expected positive relationships. Salary appears to be linearly positively related to age, years of postsecondary education, and years of experience. We note that the variability in salary increases for older ages and greater years of experience. Salary seems more strongly related to age for ages under about 40. Salary also appears to be more strongly related to years of experience under about 15. Salary plotted against years of postsecondary education is more variable than for the other explanatory variables, but the variability is also more constant. At this point, years of experience appears to be the strongest candidate as an explanatory variable.

3. The scatter diagram for age and years of experience is shown below. Note that the age axis starts at 20, since there are no workers under 20 years of age.

[Scatter diagram: "Age and Years of Experience" — Years of Experience vs. Age]
The two variables are very closely related, as we would expect. It is not possible to acquire years of experience without also acquiring years of age. It does not make sense to include both explanatory variables in the model.
5. An excerpt of the Regression output for the salaries data set with years of postsecondary education and years of experience is shown below.

SUMMARY OUTPUT

Regression Statistics
Multiple R          0.92720353
R Square            0.859706386
Adjusted R Square   0.852122948
Standard Error      6.805249337
Observations        40

ANOVA
             df
Regression   2
Residual     37
Total        39

                                   Coefficients
Intercept                          20.87563476
Years of Postsecondary Education   2.673930095
Years of Experience                1.238703002
The model is as follows: Salary ($000) = 20.9 + 2.7(Years of Postsecondary Education) + 1.2(Years of Experience). In other words, Salary = $20,876 + $2,674 for every year of postsecondary education + $1,239 for every year of experience.
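A prediction from this model is just a linear combination of the fitted coefficients; the input values below (5 years of postsecondary education, 10 years of experience) are chosen arbitrarily for illustration:

```python
# Coefficients from the regression output above (salary in $000)
intercept = 20.87563476
b_education = 2.673930095
b_experience = 1.238703002

def predict_salary(years_postsec: float, years_exp: float) -> float:
    """Predicted salary ($000) from the two-variable model."""
    return intercept + b_education * years_postsec + b_experience * years_exp

print(round(predict_salary(5, 10), 3))   # 46.632
```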
Develop Your Skills 14.2

7. The residual plots are shown below.

[Residual plot: Age Residual Plot — Residuals vs. Age]

[Residual plot: Years of Postsecondary Education Residual Plot — Residuals vs. Years of Postsecondary Education]
In the age residual plot, we see a couple of points that are above and below the desired horizontal band. Also, the residuals appear to be centred somewhat above zero. The postsecondary education residual plot looks more like the desired horizontal band, although once again, the residuals appear to be centred above zero. A plot of the residuals vs. predicted salary is shown below.
[Residual plot: "Residuals vs. Predicted Salary (Age, Years of Postsecondary Education)" — Residual vs. Predicted Salary (000)]
The plot appears to have the desired horizontal band appearance, with the residuals centred around zero. The two circled points correspond to observations with standardized residuals ≥ +2 or ≤ −2 (observations 29 and 40).

9. A histogram of the residuals for the model discussed in Exercise 6 is shown below.

[Histogram: "Residuals for Salary Model (Age, Years of Postsecondary Education, Years of Experience)" — Frequency vs. Residual]
The histogram is somewhat skewed to the right. It appears to be centred close to zero.
A histogram of the residuals for the model discussed in Exercise 7 is shown below.
[Histogram: "Residuals for Salary Model (Age, Years of Postsecondary Education)" — Frequency vs. Residual]
As for the previous model, we see the histogram is somewhat skewed to the right, and centred approximately on zero. A histogram of the residuals for the model discussed in Exercise 8 is shown below.
[Histogram: "Residuals for Salary Model (Years of Postsecondary Education, Years of Experience)" — Frequency vs. Residual]
As with the others, this histogram appears skewed to the right, but the skewness appears more pronounced here.
Develop Your Skills 14.3

11. Adjusted R²
13. Test for the significance of the overall model (age and years of postsecondary education):
H0: β1 = β2 = 0
H1: At least one of the βi's is not zero.
α = 0.05
From the Excel output, we see that F = 81.3, and the p-value is approximately zero. There is strong evidence that the overall model is significant.

Tests for the significance of the individual explanatory variables:

Age:
H0: β1 = 0
H1: β1 ≠ 0
The p-value (from the Excel output) is approximately zero, so we reject H0. There is enough evidence to conclude that age is a significant explanatory variable for salaries, when years of postsecondary education are included in the model.

Years of postsecondary education:
H0: β2 = 0
H1: β2 ≠ 0
The p-value (from the Excel output) is approximately zero, so we reject H0. There is enough evidence to conclude that years of postsecondary education is a significant explanatory variable for salaries, when age is included in the model.
15. The adjusted R² values are shown below:

Model                                                      Adjusted R²
All Explanatory Variables                                  0.85
Years of Postsecondary Education and Age                   0.80
Years of Postsecondary Education and Years of Experience   0.85
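Adjusted R² can be computed directly from R², the sample size, and the number of explanatory variables; using the full-model output from Exercise 5 (R² = 0.859706386, n = 40, k = 2) reproduces the value Excel reports:

```python
def adjusted_r2(r2: float, n: int, k: int) -> float:
    """Adjusted R-squared for n observations and k explanatory variables."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)

# Check against the Exercise 5 regression output
print(round(adjusted_r2(0.859706386, 40, 2), 6))   # 0.852123
```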
At this point, the model that contains years of postsecondary education and age does not seem worth considering. The adjusted R² value is lower than for the other models. As we have already seen, age and years of experience are highly correlated, and it appears that the model containing years of experience does a better job.

Develop Your Skills 14.4

17. It would not be appropriate to use the Woodbon model to make a prediction for mortgage rates of 6%, housing starts of 2,500, and advertising expenditure of $4,000, because the highest advertising expenditure in the sample data is only $3,500. We should not rely on a model for predictions based on explanatory variable values that are outside the range of the sample data on which the model is based.

19. The Excel output from the Multiple Regression Tools add-in is shown below (in two parts, to fit better on the page):
Confidence Interval and Prediction Intervals - Calculations
Confidence Level: 95%
Point Number 1: Age = 35, Years of Postsecondary Education = 5
Confidence Interval: (44.47730973, 51.330924)

With 95% confidence, the interval ($44,477, $51,331) contains the average salary of all individuals who are 35 years old and who have 5 years of postsecondary education. The confidence interval is narrower than the prediction interval from Exercise 18, because the variability in the average salary is less than the variability for an individual salary.

21. The text contains scatter diagrams of Woodbon Annual Sales plotted against mortgage rates and advertising expenditure (see Exhibit 14.2). Each relationship appears linear, with no pronounced curvature. A plot of the residuals versus the predicted y-values for this model is shown below.
[Residual plot: "Woodbon Model, Residuals Versus Predicted Sales (Mortgage Rates and Advertising Expenditure as Explanatory Variables)" — Residual vs. Predicted Sales Values]
The plot shows the desired horizontal band appearance, although there appears to be reduced variability for higher predicted values. The other residual plots are shown below.
The mortgage rates residual plot shows the desired horizontal band appearance.
[Residual plot: Advertising Expenditure Residual Plot — Residuals vs. Advertising Expenditure]
The advertising expenditure residual plot shows decreased variability for higher advertising expenditures. This is a concern, because it appears to violate the required conditions. At this point, we will refrain from conducting the F-test, as the required conditions are not met.

23. Here is the correlation matrix for the variables in the Salaries data set.
                                   Age           Years of Postsecondary Education   Years of Experience   Salary (000)
Age                                1
Years of Postsecondary Education   0.318722486   1
Years of Experience                0.971756454   0.227538151                        1
Salary (000)                       0.861913715   0.528597263                        0.862062768           1
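Each entry of such a matrix is a Pearson correlation coefficient. A minimal sketch of how the matrix is built (the columns below are toy data, not the Salaries data set):

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxy = sum((a - xbar) * (b - ybar) for a, b in zip(x, y))
    sxx = sum((a - xbar) ** 2 for a in x)
    syy = sum((b - ybar) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def correlation_matrix(columns):
    """Correlation matrix for a list of equal-length columns."""
    return [[pearson_r(a, b) for b in columns] for a in columns]

# Toy data: the second column is an exact linear function of the first
m = correlation_matrix([[1, 2, 3, 4], [2, 4, 6, 8], [4, 3, 3, 1]])
print(round(m[0][1], 3), round(m[0][2], 3))   # 1.0 -0.923
```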
From this we can see that years of experience and age are very highly correlated, and so we would not choose to include both in our model. Both age and years of experience are very highly correlated with salary, and so one or the other appears to be promising as an explanatory variable.

25. There are many possible models here. However, the one that looks most promising is the one that includes years of experience and years of postsecondary education. This is a logical model. Overall, it is significant, and each of the explanatory variables is significant. The standard error is relatively low. As well, the model makes sense. It is reasonable to expect that both of these factors would have a positive impact on salary. We cannot decide to rely on this model without checking the required conditions. The residual plots are shown below.
[Residual plot: "Residuals vs. Predicted Salary (Years of Postsecondary Education, Years of Experience)" — Residuals vs. Predicted Salary]

[Residual plot: Years of Postsecondary Education Residual Plot — Residuals vs. Years of Postsecondary Education]

[Residual plot: Years of Experience Residual Plot — Residuals vs. Years of Experience]
All the residual plots show the desired horizontal band appearance, centred on zero. A histogram of the residuals has some right-skewness.
[Histogram: "Residuals for Salary Model (Years of Postsecondary Education, Years of Experience)" — Frequency vs. Residual]
There are no obvious outliers or influential observations. We choose this model as the best available.

27. We used indicator variables as shown in Exhibit 14.27 in the text. The Excel Regression output is as shown below.
SUMMARY OUTPUT

Regression Statistics
Multiple R           0.404030563
R Square             0.163240695
Adjusted R Square    0.101258525
Standard Error       313.2587914
Observations         30

ANOVA
              df    SS             MS             F             Significance F
Regression    2     516890.0667    258445.0333    2.633671806   0.090179418
Residual      27    2649538.9      98131.07037
Total         29    3166428.967

             Coefficients   Standard Error   t Stat          P-value        Lower 95%
Intercept    1564.4         99.06112778      15.79226923     3.67865E-15    1361.143357
Onever       -218.9         140.0935904      -1.562526875    0.12981017     -506.3483007
Durible      -313.4         140.0935904      -2.237075937    0.03373543     -600.8483007
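The F statistic in the ANOVA output above can be recovered directly from the sums of squares and degrees of freedom. A minimal sketch of that arithmetic:

```python
# Recovering the F statistic from the ANOVA table values quoted above.
ss_reg, df_reg = 516890.0667, 2    # regression sum of squares and df
ss_res, df_res = 2649538.9, 27     # residual sum of squares and df

ms_reg = ss_reg / df_reg           # mean square for regression
ms_res = ss_res / df_res           # mean square for residual
f_stat = ms_reg / ms_res
print(round(f_stat, 2))            # 2.63, matching the Excel output
```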
Copyright © 2011 Pearson Canada Inc.
269
H0: β1 = β2 = 0
H1: At least one of the βi's is not zero.
α = 0.05

From the Excel output, we see that F = 2.63, and the p-value is about 9%. There is not enough evidence to infer that there is a significant relationship between battery life and brand. Note that this is the same conclusion we came to in Chapter 11.

29. Excel's Regression output for the model including both number of employees and shift is shown below.
SUMMARY OUTPUT

Regression Statistics
Multiple R           0.40825172
R Square             0.166669467
Adjusted R Square    0.128790806
Standard Error       4165.200265
Observations         47

ANOVA
              df    SS             MS          F          Significance F
Regression    2     152673338.7    76336669    4.400089   0.018112587
Residual      44    763351302.8    17348893
Total         46    916024641.5

                          Coefficients   Standard Error   t Stat     P-value    Lower 95%
Intercept                 33118.64628    10531.14671      3.144828   0.002976   11894.51498
Number of Employees       246.1711185    83.00625235      2.965694   0.004865   78.88301139
Shift (0=Day, 1=Night)    158.6692673    1282.379223      0.12373    0.902092   -2425.796202

The overall model is significant; however, shift is not a significant explanatory variable when the number of employees is included in the model. As well, if the model is run with only shift included as an explanatory variable, it is not significant. Therefore, it appears that shift is not a useful explanatory variable for the number of units produced. The model based on number of employees alone, while significant, is not a particularly useful model either (its adjusted R2 is only 0.15).
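The near-zero t statistic for the shift variable can be recovered from the coefficient and its standard error in the output above:

```python
# t statistic for the shift coefficient, from the Excel output values above.
coef_shift = 158.6692673   # estimated coefficient
se_shift = 1282.379223     # its standard error
t_stat = coef_shift / se_shift
print(round(t_stat, 3))    # 0.124, nowhere near significance
```

A t statistic this small (and the corresponding p-value of 0.90) is why shift is judged not useful once the number of employees is in the model.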
Chapter Review Exercises

1. The model can be interpreted as follows: Monthly Credit Card Balance = $38.36 + $0.99(Age of Head of Household) + $22.04(Income, in thousands of dollars) + $0.38(Value of the home, in thousands of dollars). Generally, monthly credit card balances are higher for older heads of household with higher incomes and more expensive homes.

3. Age of head of household:
H0: β1 = 0
H1: β1 ≠ 0
From the Excel output, the p-value is 0.93, so we fail to reject H0. There is not enough evidence to conclude that age of head of household is a significant explanatory variable for credit card balances when household income and value of the home are included in the model.

Income ($000):
H0: β2 = 0
H1: β2 ≠ 0
From the Excel output, the p-value is 0.02, so we reject H0. There is enough evidence to conclude that household income is a significant explanatory variable for credit card balances when age of head of household and value of the home are included in the model.

Value of home ($000):
H0: β3 = 0
H1: β3 ≠ 0
From the Excel output, the p-value is 0.92, so we fail to reject H0. There is not enough evidence to conclude that value of the home is a significant explanatory variable for credit card balances when age of head of household and household income are included in the model.

5.
Because the tests are done in the same format as the final exam, it is expected that the tests will prove to be better predictors of the final exam mark. However, good knowledge of the material is likely to result in higher marks for all of the evaluations, so we must consider that any one of them could be a good predictor of the final exam mark.
7.
The model that predicts the final exam mark on the basis of the mark on Test #2 is clearly the best. The adjusted R2 is higher and the standard error is lower than for all the other variations. As well, the model is significant (the p-value for the F-test is approximately zero).
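The adjusted R2 values used in these comparisons come from a simple formula involving R2, the number of observations n, and the number of explanatory variables k. A minimal sketch with hypothetical values (n = 40 and the R2 figures here are illustrations, not the course data):

```python
# Adjusted R^2 penalizes a model for each explanatory variable it uses.
def adjusted_r2(r2, n, k):
    """Adjusted R^2 for a model with k explanatory variables and n observations."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)

# Adding a variable only helps if R^2 rises enough to offset the lost df.
print(round(adjusted_r2(0.53, 40, 1), 2))  # one-variable model -> 0.52
print(round(adjusted_r2(0.59, 40, 2), 2))  # two-variable model -> 0.57
```

Because of the penalty term, adjusted R2 can fall when a weak explanatory variable is added, which is why it is preferred over plain R2 for comparing models of different sizes.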
9.
All of the models that include Test #2 as an explanatory variable are better than those which do not. Test #2 was the basis for the best model when only one explanatory variable was included in the model, and so this is not surprising.
The two-variable model with the highest adjusted R2 contains Test #2 and Assignment #2. Adding Assignment #2 to Test #2 as an explanatory variable increases the adjusted R2 value from 0.51 to 0.57, and the standard error decreases from 14.3 to 13.4. Prediction and confidence intervals made with the two-variable model would be narrower than for the model with only Test #2. The two-variable model is better, but whether it is "best" depends on how the model will be used. Suppose it is being used to predict exam marks and identify students who are in danger of failing the course, or of not achieving a grade level necessary for external accreditation. Test #2 is a significant explanatory variable. If Assignment #2 comes much later in the course, it may be better to use the single-variable model, so that the student can be alerted to a potential problem earlier, with time for adjustments.

11. None of the three-variable models represents a real improvement on the model that includes Test #2 and Assignment #2. As we might expect, the best three-variable models include both Test #2 and Assignment #2. The best of these, in terms of adjusted R2, also contains Test #1. However, Test #1 is not significant as an explanatory variable when Test #2 and Assignment #2 are included in the model. This is true for all the other models that include both Test #2 and Assignment #2: the third explanatory variable is not significant when Test #2 and Assignment #2 are included in the model.

13. Using the model that includes Test #2 and Assignment #2, the Excel output is shown below (split for visibility):
Confidence Interval and Prediction Intervals - Calculations
Confidence Level: 95%
Point Number   Assignment #2   Test #2
1              65              70

Prediction Interval
Lower Limit    Upper Limit
48.90969082    102.4567362

We have 95% confidence that the interval (48.9, 100) contains the final exam mark of a student who received a mark of 65 on Assignment 2 and 70 on Test 2 (the calculated upper limit of about 102.5 exceeds the maximum possible mark, so the interval is capped at 100). After all the analysis, it appears that the best model generates a prediction interval so wide that it is not really useful.

15. Year of the car is not really a quantitative variable. There are four years (2004, 2005, 2006, and 2007) in the sample data set, so three indicator variables are required. They could be set up as follows:

Year   Indicator Variable 1   Indicator Variable 2   Indicator Variable 3
2004   1                      0                      0
2005   0                      1                      0
2006   0                      0                      1
2007   0                      0                      0
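The indicator coding in the table above can be sketched as a small function; 2007 is the base year, so all three indicators are zero for it:

```python
# Build the three indicator (dummy) variables for the four model years.
def year_indicators(year):
    """Return (ind1, ind2, ind3) for model years 2004-2007; 2007 is the base."""
    return (int(year == 2004), int(year == 2005), int(year == 2006))

for year in (2004, 2005, 2006, 2007):
    print(year, year_indicators(year))
```

Note that only three indicators are needed for four categories: including a fourth would make the indicators perfectly collinear with the intercept.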
All possible regressions calculations provide many possible models. However, notice again that there is only one observation for the year 2007. The data set is not really large enough to support this analysis. We will proceed, out of curiosity. The model with the highest adjusted R2 contains only kilometres and the indicator variable specifying whether or not the car is from the 2004 model year.

Model Number      5
Adjusted R^2      0.590043715
Standard Error    1564.675734
K                 2
Significance F    2.11076E-05

Variable Labels                 Coefficients    p-value
Intercept                       21641.08247     8.12791E-17
Kilometres                      -0.049699268    0.000244684
1=Year 2004, 0=Not Year 2004    -2071.672715    0.003726813

The model is as follows:
For the model year 2004: List Price = $21,641.08 - $0.05(Kilometres) - $2,071.67
For model years 2005, 2006, and 2007: List Price = $21,641.08 - $0.05(Kilometres)

Notice that this model is more intuitive than the model from Exercise 14, which was:
List Price = -$2,076,840.42 + $1,045.99(Year) + 0.043(Kilometres)
Such a model does not really make sense, and this should have been your clue that treating the year of a car as a quantitative variable is not the correct approach.

17. While it is tempting to add the new data and analyze the model that was best in the analysis we did for Exercise 16, the correct approach is to look at all possible models. We have to allow for the possibility that the new information ALONE will be the basis of the most important explanatory variable. In fact, the output of all possible regressions calculations shows that including the indicator variable for the location being within a five-minute drive of a major highway does improve the model we chose as best for Exercise 16. However, the best of all of the models, in terms of adjusted R2, is the model with all possible explanatory variables. The adjusted R2 for this model is 0.656, compared with 0.517 for the preferred model in Exercise 16. The data requirements for this model are more onerous, and this would have to be taken into consideration before the model was selected. While local population and median incomes could be obtained through Statistics Canada, information about estimated weekly traffic volume would probably have to be collected (and possibly over several weeks).
However, the information about whether a location is within a five-minute drive of a major highway could be obtained by looking at road maps and estimating driving distances. We will analyze the "all-in" model to see if it conforms to the required conditions. The residual plots look acceptable, the histogram of residuals appears normally distributed, and there are no obvious outliers or influential observations.
[Residual plot: Residuals vs. Local Population]
[Residual plot: Residuals vs. 1=Within Five-Minute Drive of Major Highway, 0=Otherwise]
[Residual plot: Residuals vs. Median Income in Local Area]
[Residual plot: Residuals vs. Estimated Traffic Volume (Weekly)]
[Residual plot: Residuals vs. Predicted Monthly Sales, Doughnut Shop Sales Prediction Model (All Explanatory Variables)]
[Histogram of residuals: Doughnut Shop Sales Prediction Model (All Explanatory Variables)]
It appears the model meets the required conditions. The model is as follows:
For locations within a five-minute drive of a major highway:
Predicted Monthly Sales = $17,413 + 0.009(Local Population) + 0.089(Median Income in Local Area) + 0.145(Estimated Weekly Traffic Volume) + $1,137
For locations not within a five-minute drive of a major highway:
Predicted Monthly Sales = $17,413 + 0.009(Local Population) + 0.089(Median Income in Local Area) + 0.145(Estimated Weekly Traffic Volume)

19. There are many possible models. However, for many of them, when the overall model is significant, some of the individual explanatory variables are not significant, given the other explanatory variables in the model. This is not surprising, as the factors that lead to student success in one subject probably contribute to student success in other subjects. The best one-variable model is based on the mark in Intermediate Accounting 1. The best two-variable model includes the marks in Intermediate Accounting 1 and Cost Accounting 1. Model results are summarized below.
Model Number      1
Adjusted R^2      0.520480342
Standard Error    12.93893991
K                 1
Significance F    2.05994E-09

Variable Labels              Coefficients   p-value
Intercept                    17.2097272     0.008500616
Intermediate Accounting 1    0.711183518    2.05994E-09

Model Number      6
Adjusted R^2      0.590366643
Standard Error    11.95895263
K                 2
Significance F    2.92403E-10

Variable Labels              Coefficients   p-value
Intercept                    14.52779938    0.016864759
Intermediate Accounting 1    0.420200699    0.002427925
Cost Accounting 1            0.377202568    0.003952342
Of these two, Model Number 6 appears to be the better model, with a higher adjusted R2 and a somewhat lower standard error.

21. We have 95% confidence that the interval (42, 91) contains the Statistics 1 mark of an individual student who achieved a mark of 65 in both Cost Accounting 1 and Intermediate Accounting 1.
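The selection rule applied in Exercise 19 (prefer the candidate with the higher adjusted R2, checking the standard error as well) can be sketched as follows, using the summary values quoted in the output above:

```python
# Pick the better of the two candidate models by adjusted R^2.
models = [
    {"name": "Model 1 (Intermediate Accounting 1)",
     "adj_r2": 0.520480342, "std_err": 12.93893991},
    {"name": "Model 6 (Intermediate Accounting 1 + Cost Accounting 1)",
     "adj_r2": 0.590366643, "std_err": 11.95895263},
]

best = max(models, key=lambda m: m["adj_r2"])
print(best["name"])

# Sanity check: the higher-adjusted-R^2 model also has the lower standard error here.
other = min(models, key=lambda m: m["adj_r2"])
print(best["std_err"] < other["std_err"])  # True
```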