Math 11: Final Semi Study Guide


Professor: Jason Schweinsberg
Supplemental Instruction: Juan Djuwadi
Quarter: Winter 2017


The Central Limit Theorem (revisited/reminded)
No matter the distribution of the population, the distribution of the sample mean is approximately normal if the sample size is large enough. The population has a mean μ and standard deviation σ. The sample mean (ȳ) has a distribution centered at the population mean μ, with a standard deviation of σ/√n.

The population distribution need not be normal at all. Suppose we take 1,000 samples of different sizes, find their means, and plot them on a histogram.

n = 3, n = 10, n = 30: the larger the sample size, the more normal the resulting sampling distribution will be! Note: a larger sample size also produces a smaller sampling-distribution variance, so as the sample size gets larger, the sample mean becomes a better estimator of the population mean. You try! http://onlinestatbook.com/stat_sim/sampling_dist/
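If you'd rather not click the applet, here's a minimal numpy sketch of the same experiment (the exponential population, seed, and sample sizes are illustrative choices, not from the course):

```python
import numpy as np

rng = np.random.default_rng(0)

# exponential(1) population: mean 1, sd 1, and clearly NOT normal
spread = {}
for n in (3, 10, 30):
    # take 1,000 samples of size n and record each sample mean
    means = rng.exponential(1.0, size=(1000, n)).mean(axis=1)
    spread[n] = means.std()
    # the means center near mu = 1, and their spread shrinks like sigma/sqrt(n)
    print(n, round(means.mean(), 3), round(spread[n], 3), round(1 / n**0.5, 3))
```

The histograms of `means` for n = 30 look much more bell-shaped than for n = 3, matching the applet.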


If the sampling distribution is approximately normal, then we can apply the 68-95-99 rule! 68% of data values are within 1 SD of the mean. 95% are within 2 SD of the mean. 99.7% are within 3 SD of the mean.

This rule applies to any data that are Normal. OK – so let's go back to the means and standard deviations. We know the sample has mean ȳ, and the standard deviation of the sampling distribution should be σ/√n.

BUT – herein lies a problem. Often we take samples precisely so we can draw conclusions about the population, which means we usually have very limited information about the population. It's likely that we don't know what the true standard deviation (σ) is. How then can we compute the standard deviation of the sampling distribution with the formula σ/√n?

Like with many other things we do in statistics, we estimate it! Using the sample standard deviation from the data (s), we estimate the population standard deviation (σ). We then compute the standard error (s/√n), which becomes the estimator of the sampling standard deviation (σ/√n).

s estimates σ, and SE(ȳ) = s/√n estimates SD(ȳ) = σ/√n.

BUT BUT BUT – when we estimate the standard deviation this way, it creates more variability, especially for smaller samples. Using this method for larger samples (which were approximately normal) worked well, but the extra variability in smaller samples distorted the P-values. Therefore, a new distribution was created.

T-Distributions
For smaller sample sizes (n < 30), or populations where you don't know σ, Gosset's t-distribution is a better approximation of the sampling distribution than the normal model. The t-distribution is also unimodal and symmetric, but the tails of the distribution are thicker, which changes the areas under the curve. Therefore, we can't use the 68-95-99 rule, at least not for smaller samples (see why below!)


Z to T

In fact there's a WHOLE family of t-distributions (an infinite number of them, actually). We choose the one that best approximates our sampling distribution by looking at our degrees of freedom, found by taking the size of our sample and subtracting 1: Degrees of Freedom (df): n – 1. As the degrees of freedom get bigger, the t-distribution slowly morphs into the standard normal distribution. In fact, a t-distribution with an infinite number of degrees of freedom is exactly normal. This has some implications: the larger the sample size, the better the sampling distribution is approximated by a normal model, which means the 68-95-99 rule becomes more and more relevant with larger samples. Okay, but what exactly is the degrees of freedom? Don't worry about it. It's basically the amount of variability given to our observations: for a given mean, every observation can vary except the last one, hence n – 1.

I don’t get it, what does that mean? Don’t worry about it.
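You can watch the t-to-normal convergence directly with scipy (the particular df values below are arbitrary):

```python
from scipy import stats

# two-sided 95% critical values: t* shrinks toward z* = 1.96 as df grows
tstars = {df: stats.t.ppf(0.975, df) for df in (2, 14, 30, 100, 1000)}
for df, tstar in tstars.items():
    print(df, round(tstar, 3))
print("z*:", round(stats.norm.ppf(0.975), 3))
```

With only 2 degrees of freedom the critical value is over 4; by df = 1000 it is essentially the normal model's 1.96.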

*Why Divide by n – 1 (Don't Worry About It)*
When we find the sample standard deviation to estimate the population standard deviation, we divide by n – 1, not by n. If we knew the population mean μ, we would use it to find the sample standard deviation:

s = √( Σ(y − μ)² / n )

But we DON'T KNOW the population μ, so we use ȳ in its place, which causes a problem:


s = √( Σ(y − ȳ)² / n )

This is because the sample observations will ALWAYS be closer to their own sample mean ȳ than to the population mean μ, which implies Σ(y − ȳ)² < Σ(y − μ)². That means our sample standard deviation comes out too small. We fix it by dividing by n − 1 to increase it:

s = √( Σ(y − ȳ)² / (n − 1) )

Wait, I actually know σ! If you know σ, use z (but this is rare). If you use s to estimate σ, then use the t-distribution.
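A quick simulation makes the "too small" problem visible. This sketch (population and sample size chosen arbitrarily for illustration) compares dividing by n with dividing by n − 1:

```python
import numpy as np

rng = np.random.default_rng(1)
# population: Normal with sd = 2, so the true variance is 4
samples = rng.normal(0.0, 2.0, size=(100_000, 5))   # many small samples, n = 5

var_n  = samples.var(axis=1, ddof=0).mean()  # divide by n     -> biased low
var_n1 = samples.var(axis=1, ddof=1).mean()  # divide by n - 1 -> about right
print(round(var_n, 2), round(var_n1, 2))
```

Dividing by n lands near 3.2 (that's 4 × (n−1)/n), while dividing by n − 1 averages out near the true 4.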

Assumptions and Conditions for Using the t-Model
Independence Assumption: data values should be independent of each other.
Randomization Condition: satisfied if the data arise from a random sample or a suitably randomized experiment. This also helps make sure our observations are independent.
Nearly Normal Condition: the data come from a distribution that is unimodal and symmetric. The only way to check this is to make a histogram of the data. When the sample size is larger than 40–50, the t-methods are safe to use unless the data are extremely skewed. If the sample size is smaller but still of moderate size (15–40), t-methods can still work well as long as the data are unimodal and reasonably symmetric. The larger our sample size gets, the more skewness we can allow.

Confidence Interval for Means
For means, we use Gosset's t-model with n – 1 degrees of freedom. The mechanics of this confidence interval are not so different from those for proportions. In fact, when moving from proportions to means, "we use all the same macro-level ideas…but we change a few micro-level details".

Standardized sample mean:

t = (ȳ − μ) / SE(ȳ)

This follows a t-MODEL with n − 1 degrees of freedom. We estimate the standard deviation with

SE(ȳ) = s/√n


When using t-models your confidence intervals will be a bit wider and your P-values a bit larger. This corrects for the uncertainty: using a t-model compensates for the extra variability from estimating σ.

One-Sample Confidence Interval for the Mean
The confidence interval for the mean μ is:

ȳ ± t*_{n−1} · SE(ȳ)

This uses a t-distribution with n − 1 degrees of freedom, where the standard error of the mean is SE(ȳ) = s/√n. Remember, we don't know σ, so we use the standard error to approximate the standard deviation of the sampling distribution. The critical value t*_{n−1} depends on the confidence level and on the number of degrees of freedom you get from the sample size. (The critical values are NOT z*!)

t-models are unimodal, symmetric, and bell-shaped just like the normal, but t-models with only a few degrees of freedom have longer tails and a larger standard deviation than the normal, so the margins of error are bigger → smaller samples have more variability. As you increase your sample size, your SE = s/√n goes down.

A t-model with infinite degrees of freedom is exactly normal – but after a hundred degrees of freedom it's hard to tell the difference.
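Putting the pieces together, a one-sample t-interval takes only a few lines of Python. The summary numbers below are made up for illustration (n = 25 at 90% confidence, so t*₂₄ = 1.711 as in the table example later):

```python
from scipy import stats

# hypothetical sample summary: n observations, sample mean ybar, sample sd s
n, ybar, s = 25, 6.64, 1.0
se = s / n ** 0.5
tstar = stats.t.ppf(0.95, df=n - 1)          # 90% CI leaves 5% in each tail
lo, hi = ybar - tstar * se, ybar + tstar * se
print(round(tstar, 3), round(lo, 3), round(hi, 3))
```

The same interval comes out of `stats.t.interval(0.90, df=n-1, loc=ybar, scale=se)` if you prefer a one-liner.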


Using the t-Table to Find t-Values
If you can't find a row for the degrees of freedom you need, just use the next smaller degrees of freedom in the table. Example: our sample size is 25, so the degrees of freedom is 24, and we want a 90% confidence interval. Our t-critical value is t*₂₄ = 1.711.

Meaning: our observation is 1.711 standard errors away from the mean.

To find a critical value, locate the row of the table corresponding to the degrees of freedom and the column corresponding to the probability you want. A 90% confidence interval leaves 5% of values on either side, so look for 0.05 at the top of the column or 90% at the bottom. A computed t-statistic can take on any value, so it will likely not be listed in the table. The best we can do is trap the calculated t-value between two columns; the P-value will then lie between the two values at the heads of those columns. REMEMBER – unlike the normal table, where z-values give the probability of everything to the left of the observation, the t-table gives the probability to the right of the critical value.

(Figures: finding the P-value from the t-table vs. from the z-table.)


Interpreting the Confidence Interval
DON'T SAY: "90% of all samples will have mean sleep between 6.272 and 7.008 hours per night." This interval is no more (or less) likely to be correct than any other. You could say that 90% of all possible samples will produce intervals that do contain the true mean sleep; the PROBLEM with this, however, is that we'll never know what the true mean sleep is, so we can't know if our sample was one of those 90%.
DO SAY: "90% of all intervals that could be found this way would cover the true value," OR "we are 90% confident that the true population mean lies between the corresponding boundaries of the margin of error (6.272 and 7.008)."
Confidence interval in proper context: when we construct confidence intervals in this way, we expect 90% of the confidence intervals we make to cover the true mean and 10% to miss the true value. You're making a statement about where the true mean might lie, not about other sample means. Our uncertainty is about the INTERVAL, not the true mean – the interval varies randomly. The true mean is not variable (just unknown).
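That "proper context" interpretation can be checked by simulation: build many 90% intervals from a population whose mean we secretly know, and count how many cover it. All numbers below are illustrative:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
mu, sigma, n, trials = 50.0, 10.0, 25, 2000   # hypothetical population & design
tstar = stats.t.ppf(0.95, df=n - 1)           # 90% two-sided critical value

covered = 0
for _ in range(trials):
    sample = rng.normal(mu, sigma, size=n)
    ybar, se = sample.mean(), sample.std(ddof=1) / n ** 0.5
    if ybar - tstar * se <= mu <= ybar + tstar * se:
        covered += 1

coverage = covered / trials
print(coverage)   # close to 0.90: about 90% of intervals cover the true mean
```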


Chapter 20, Q7) The housing market has recovered slowly from the economic crisis of 2008. Recently, in one large community, realtors randomly sampled 36 bids from potential buyers to estimate the average loss in home value. The sample showed the average loss was $9560 with a standard deviation of $1500. a) What assumptions and conditions must be checked before finding a confidence interval? How would you check them?

b) Find a 95% confidence interval for the mean loss in value per home.

c) Interpret this interval and explain what 95% confidence means in this context.


Chapter 20, Q29) What are the chances your flight will leave on time? The US Bureau of Transportation Statistics publishes information about airline performance. Here are a histogram and summary statistics for the percentage of flights departing on time each month from 1994 through 2013.

a) Check the assumptions and conditions for inference.

b) Find a 90% Confidence interval for the true percentage of flights that depart on time.

c) Interpret this interval for a traveler planning to fly.


What is a Hypothesis Test?
The internet has a lot of really complicated definitions of hypothesis testing, but for now it's fine to think of it simply as a claim – one that we want to test to see if it's right or wrong. In terms of its scope, the claim is quite broad, usually a statement about a population as a whole, maybe about the world. BUT we can't go around asking all 7.4 billion people – that's super expensive and super impractical – so what we do is take a random sample which (we hope) is representative of the population.
The Actual Hypotheses
There are only two hypotheses you need to state in every word problem: H₀ and H_A, the Null and the Alternative Hypothesis. The Null states that NOTHING has changed. The Alternative is all the values of the parameter that we consider reasonable if the null ends up being rejected. If we make a claim, we want to REJECT the NULL, because that means our ALTERNATIVE is plausible.
One-Sided and Two-Sided Tests
One-sided tests have a rejection region in one tail, while two-sided tests have rejection regions in both tails.

Two-Tail Test: H₀: μ = 0, H_A: μ ≠ 0 (rejection regions in both tails)
Left-Tail Test: H₀: μ = 0, H_A: μ < 0 (rejection region in the left tail)
Right-Tail Test: H₀: μ = 0, H_A: μ > 0 (rejection region in the right tail)

We use a one-sided test when there is a direction to our claim. We use a two-sided test when we think there is a change in the mean but we don't know (or aren't interested in) which way it deviates. Attending SI could reduce your study time if your SI leader is insightful and competent, but could also increase your study time if he/she is confusing and ignorant – probably the former though, right? 😃 In this case, we use a two-sided test: we know study time will change but we don't know which way.


Duh-Heck Is a P-Value?
Okay, we know by now we need this to conduct our tests, but what is it exactly? The P-value is simply our evidence against the null hypothesis. When we conduct our test, we assume that the null hypothesis is true! So P-values are conditional probabilities: they give the probability of observing the result we have seen, given that the null is true:

P(observed statistic value | H₀)

Given that the null is true, the P-value gives us a sense of how surprising it is to see the data we got. THAT MEANS: if our P-value is small, it's super unlikely we'd see this data if our null hypothesis were true! This means something is up: how could we observe such data if it's "supposed" to be super rare? It's likely our null isn't exactly right, and the true parameter lies elsewhere – closer to our claim. OF COURSE: we can't discount the possibility that we "struck gold" in our test and encountered a random sample that produced this super rare result purely from the natural variability between random samples. This is why it's encouraged to conduct more tests or to take larger samples. OK, cool. But how do we know our P-value is small enough to reject our null? How rare is rare?

Alpha Levels
The alpha level is the reference point for the P-value that helps us answer that question. It's related to confidence intervals in that you find your alpha level by subtracting your confidence level from 100% (so if I want to be 95% confident, my alpha level is 5%), but don't worry too much about that. Just know that we always use it as a threshold for our P-value:

Reject H₀ if P-value < α. Fail to reject H₀ if P-value > α.


One loose way to think of the P-value is as the probability that the null hypothesis is true. So, if we set α = 5% – which is the default level – a P-value less than 5% means the probability the null is true is way too low, so we reject it. But if the P-value is greater than 5%, the probability the null could be true is just too high for us to reject it, so we don't. Note: this isn't technically correct – the P-value is computed assuming the null is true, and we can never prove the null. Graphically, the P-value is the area in the tail beyond the test statistic. One-tail test: P-value = P(t_{n−1} > t). Two-tail test: P-value = 2·P(t_{n−1} > |t|).
Statistical Significance?
This is just fancy stat jargon. If we reject the null, the results are statistically significant. It means we had a P-value lower than our alpha level – that's it!

One-Sample t-Test for the Mean
The test compares the difference between the observed statistic and the hypothesized value. For means, the probability model is Gosset's t-distribution with n – 1 degrees of freedom. We test the hypothesis H₀: μ = μ₀ using

t_{n−1} = (ȳ − μ₀) / SE(ȳ)

where the standard error is SE(ȳ) = s/√n.

The rest of the procedure is just the same as we have done with all other hypothesis tests.


Let's Try! Hypothesis Testing for Means
Are runners getting faster? The mean time for all runners in the Cherry Blossom Race in 2006 was 93.29 minutes. We pick 15 random racers from the 2015 race to see if they are faster. We find the sample mean was 91 minutes and the sample standard deviation was 6 minutes.

We want to know if they're faster, so this is a one-sided test. A lower time means a faster race, so if the observed statistic is sufficiently below the mean we have evidence for our claim – which implies a left-tailed test. Let μ be the mean time of all racers in 2015. Our hypotheses are: H₀: μ = 93.29, H_A: μ < 93.29.

The sample size is 15 runners. That's rather small, and we also don't know the population standard deviation. The normal model is a poor approximation of the sampling distribution; the t-model with 15 − 1 = 14 degrees of freedom does a much better job.

Here's our t-distribution with 14 degrees of freedom. Our alpha is 5%, denoted by the blue area; the critical value associated with that alpha level is −2.145.


If the P-value is less than alpha, then we reject the null hypothesis. Since we also found the critical value for a 5% alpha level in this distribution, a t-statistic more extreme than −2.145 (greater than 2.145 in magnitude, in the left tail) immediately tells us we can reject the null.
Find the test statistic: n = 15, df = 14, μ₀ = 93.29, ȳ = 91, s = 6.

t₁₄ = (91 − 93.29) / (6/√15) ≈ −1.478

From the t-statistic alone we already know there is insufficient evidence, since |−1.478| = 1.478 < 2.145. The observed statistic is not "crazy" enough to reject our null. Let's find the P-value anyway. We look at the t-table:


From the table, our t-statistic falls between P-values of 0.10 and 0.05. The P-value is greater than our alpha, which means there is insufficient evidence to reject the null hypothesis.

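For what it's worth, the whole runner test can be checked with scipy using the same summary numbers:

```python
from scipy import stats

# numbers from the example: n = 15, ybar = 91, s = 6, hypothesized mu0 = 93.29
n, ybar, s, mu0 = 15, 91.0, 6.0, 93.29
t = (ybar - mu0) / (s / n ** 0.5)
p = stats.t.cdf(t, df=n - 1)        # left-tailed test: area below the statistic
print(round(t, 3), round(p, 3))     # t about -1.478; p lands between 0.05 and 0.10
```

Since p > 0.05, we fail to reject H₀, matching the table-trapping conclusion.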

Connection Between Intervals and Tests
Confidence intervals and significance tests are built from the same calculations. Here's the connection: the confidence interval contains all the null-hypothesis values we CAN'T reject with these data. Null-hypothesis values inside the interval are plausible; values outside the interval (in the tails) are too far away.
Summaries and Points
• The t-statistic tells us how many standard errors our sample mean is from the hypothesized mean.
• For means, we have to estimate the standard error separately; this added uncertainty changes the model for the sampling distribution from z to t.
• Randomization when drawing a random sample is what generates the sampling distribution and allows for independence.
• An increase in the standard deviation increases the width of the confidence interval ȳ ± t*_{n−1} × SE(ȳ), assuming we maintain the same level of confidence.
• The margin of error is obtained by multiplying the critical value of t by the standard error of the sample mean. A lower confidence level yields a lower critical value of t, which lowers the margin of error and narrows the confidence interval.


• If you increase the sample size, the sampling distribution approaches the normal model, so drawing a bigger sample is more appropriate.
• A lower confidence level gives a smaller margin of error → ME decreases with the level of confidence.
• The t-critical value converges to the z-critical value (normality) as the sample size/degrees of freedom grows. t-critical values for the same tail probability are usually larger, since the tails are fatter, but the distribution converges to normal as the degrees of freedom increase, so the critical values converge as well.
• When you fail to reject the null, there is insufficient evidence to show that the parameter value is incorrect – it does not prove that the null is correct.

Meme Break:


Chapter 20, Q 37) In 1960, census results indicated that the age at which American men first married had a mean of 23.3 years. It is widely suspected that young people today are waiting longer to get married. We want to find out if the mean age of first marriage has increased during the past 40 years. a) Write appropriate hypotheses

b) We plan to test our hypothesis by selecting a random sample of 40 men who married for the first-time last year. Do you think the necessary assumptions for inference are satisfied? Explain.

c) Describe the approximate sampling distribution model for the mean age in such samples.

d) The men in our sample married at an average age of 24.2 years, with a standard deviation of 5.3 years. What’s the P-value for this result?

e) Explain (in context) what this P-value means?

f) What's your conclusion?


Chapter 20, Q 48) Should you generate electricity with your own personal wind turbine? That depends on whether you have enough wind on your site. To produce enough energy, your site should have an annual average wind speed above 8 miles per hour, according to the Wind Energy Association. One candidate site was monitored for a year, with wind speeds recorded every 6 hours. A total of 1114 readings of wind speed averaged 8.019 mph with standard deviation 3.813 mph. You’ve been asked to make a statistical report to help the landowner decide whether to place a wind turbine at this site. a) Discuss the assumptions and conditions for using Student’s t inference methods with these data. Here are some plots that may help you decide whether the methods can be used:

b) What would you tell the landowner about whether this site is suitable for a small wind turbine? Explain.


Confidence Interval for the Difference Between Two Means So, we have two populations with unknown parameters and we take samples from each of them. Each sample generates summary statistics.

We continuously take samples of the same size and generate the following distributions:

But now we ask: what does the distribution of the difference between their means look like? Interestingly, the difference in the two observed means, ȳ₁ − ȳ₂, is itself a statistic with a standard deviation and a sampling distribution, centered at μ₁ − μ₂. Since both sample means have sampling distributions modeled by t-distributions, the difference in means will also be modeled by a t-distribution, with degrees of freedom ≈ min(n₁ − 1, n₂ − 1). Note: computers use a rather strange formula to calculate the degrees of freedom, and since the sizes of the two samples probably differ, choosing the degrees of freedom is more complicated. The min rule above is easier, but it's a conservative estimate: it gives you fewer degrees of freedom than you are entitled to.

RECALL: for independent random variables, the variance of their difference is the sum of their individual variances: Var(Y − X) = Var(Y) + Var(X), sometimes called the Pythagorean theorem of statistics. This only works if Y and X are independent, so both samples must be independent.


So, to find the standard deviation of the difference between two independent sample means, we add the variances and take the square root:

SD(ȳ₁ − ȳ₂) = √( Var(ȳ₁) + Var(ȳ₂) ) = √( σ₁²/n₁ + σ₂²/n₂ )

BUT we don't know the true standard deviations, so we have to use the sample standard deviations (s₁ and s₂) to estimate them. Using these estimates gives us the standard error:

SE(ȳ₁ − ȳ₂) = √( s₁²/n₁ + s₂²/n₂ )

Since we are estimating the standard error, we use the t-distribution as the sampling model. The confidence interval is called a two-sample t-interval (for the difference of means). The hypothesis test is called a two-sample t-test.

(ȳ₁ − ȳ₂) ± ME, where ME = t* × SE(ȳ₁ − ȳ₂)

Assumptions and Conditions for the Two-Sample t-Confidence Interval
Independence: we need independence within each group and independence between the groups. Randomization is evidence of independence. No statistical test can verify independence – you must think about how the data were collected; randomization makes independence plausible.
Nearly Normal Condition: check both groups. The assumption matters most when the sample sizes are small. For samples of n < 15 in either group, you shouldn't use these methods if the histograms show severe skewness. For sample sizes closer to 40, mild skewness is OK, but note outliers. When both groups are bigger than 40, the Central Limit Theorem dictates that the sampling distributions will be nearly Normal regardless, but you should still be wary of extreme skewness and outliers. When you check the Nearly Normal Condition, it's important to look at both histograms for normality or skewness.
Two-Sample t-Interval for the Difference Between Means
The confidence interval for the difference between the means of two independent groups, μ₁ − μ₂, is:

(ȳ₁ − ȳ₂) ± t*_df × SE(ȳ₁ − ȳ₂)

where the standard error of the difference of means is:

SE(ȳ₁ − ȳ₂) = √( s₁²/n₁ + s₂²/n₂ )


Remember! The critical value t*_df depends on the confidence level you specify and on the number of degrees of freedom. The critical values are placed on the t-distribution so that C% of the area lies between −t*_df and t*_df, where C is your desired confidence level.
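Here's a sketch of the two-sample t-interval using the conservative min(n₁ − 1, n₂ − 1) rule (the group summaries are made up for illustration):

```python
from scipy import stats

# hypothetical summaries for two independent groups
n1, ybar1, s1 = 30, 52.1, 4.5
n2, ybar2, s2 = 35, 49.8, 5.2

se = (s1**2 / n1 + s2**2 / n2) ** 0.5        # SE of the difference in means
df = min(n1 - 1, n2 - 1)                     # conservative df rule from the text
tstar = stats.t.ppf(0.975, df)               # 95% interval
diff = ybar1 - ybar2
lo, hi = diff - tstar * se, diff + tstar * se
print(round(lo, 2), round(hi, 2))
```

With these made-up numbers the interval straddles 0, so at 95% confidence we couldn't rule out "no difference".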

Two-Sample t-Test: Testing for the Difference Between Two Means
This test finds the difference between the observed group means and compares it with a hypothesized value for that difference, which we call Δ₀ (delta naught) and which is almost always zero. So we write: H₀: μ₁ − μ₂ = Δ₀ = 0. We compute the difference in means along with the standard error of the difference. For a difference between independent means, we find P-values from a t-model with the adjusted degrees of freedom, using the statistic:

t = ((ȳ₁ − ȳ₂) − Δ₀) / SE(ȳ₁ − ȳ₂)

where the standard error of ȳ₁ − ȳ₂ is:

SE(ȳ₁ − ȳ₂) = √( s₁²/n₁ + s₂²/n₂ )
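scipy can run this test straight from summary statistics; with `equal_var=False`, `ttest_ind_from_stats` uses the software-style (Welch) degrees of freedom rather than the conservative min rule. The summary numbers here are hypothetical:

```python
from scipy import stats

# two-sample t-test from hypothetical summaries:
# group 1: mean 52.1, sd 4.5, n 30; group 2: mean 49.8, sd 5.2, n 35
t, p = stats.ttest_ind_from_stats(52.1, 4.5, 30, 49.8, 5.2, 35, equal_var=False)
print(round(t, 3), round(p, 3))   # two-sided P-value
```

If p < α we reject H₀: μ₁ − μ₂ = 0; otherwise we fail to reject, exactly as in the one-sample case.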


Chapter 22, Q 53) The Consumer Reports article listed fat content (in grams) for samples of beef and meat hot dogs. The resulting 90% confidence interval for μ_Meat − μ_Beef is (−6.5, −1.4). a) The endpoints of this confidence interval are negative numbers. What does that indicate?

b) What does the fact that the confidence interval does not contain 0 indicate?

c) If we use this confidence interval to test the hypothesis that μ_Meat − μ_Beef = 0, what's the corresponding alpha level?


Chapter 22, Q 61) A man who moves to a new city sees that there are two routes that he could take to work. A neighbor who has lived there a long time tells him Route A will average 5 minutes faster than Route B. The man decides to experiment. Each day, he flips a coin to determine which way to go, driving each route 20 days. He finds that Route A takes an average of 40 minutes, with standard deviation 3 minutes, and Route B takes an average of 43 minutes, with standard deviation 2 minutes. Histograms of travel times for routes are roughly symmetric and show no outliers. a) Find 95% confidence interval for the difference in average commuting time for the two routes (from technology, df=33.1)

b) Should the man believe the old-timer’s claim that he can save an average of 5 minutes a day by always driving Route A? Explain.


Chapter 22, Q 77) Researchers investigated how the size of a bowl affects how much ice cream people tend to scoop when serving themselves. At an "ice cream social," people were randomly given either a 17 oz or a 34 oz bowl (both large enough that they would not be filled to capacity). They were then invited to scoop as much ice cream as they liked. Did the bowl size change the selected portion size? Here are the summaries:

Test an appropriate hypothesis and state your conclusions. For assumptions and conditions that you cannot test, you may assume that they are sufficiently satisfied to proceed.


Paired Data and Blocking
Paired data arise in several ways; the most common is to compare subjects with themselves before and after a treatment. Pairs that arise from experiments are the result of blocking; pairs that arise from observational studies are the result of matching. There's no statistical test to identify paired data – you have to determine that yourself. With paired data, it's the differences we care about, so we treat the differences of each pair as if they were the data themselves and then use a one-sample t-test. A paired t-test is just a one-sample t-test for the mean of these pairwise differences; the sample size is the number of pairs.
Assumptions and Conditions
Paired Data Condition: the data must actually be paired! You can't just pair them when they're really independent. Remember, two-sample t-methods aren't valid without independent groups, and paired groups are NOT independent. This condition is easy to check if you understand how the data were collected.
Independence Assumption: if the data are paired, the groups are not independent. It's the differences that must be independent of each other. To help assure independence, generate the sample/data with randomization.
Nearly Normal Condition: check with a histogram or Normal probability plot of the differences – not of the individual groups. The assumption matters less when we have more pairs to consider: even if your measurements are skewed/bimodal, the differences may be nearly Normal.
The Paired t-Test
We test the hypothesis that the paired populations don't differ at all: H₀: μ_d = Δ₀, where the d's are the pairwise differences and Δ₀ is almost always 0. Use the statistic:

t_{n−1} = (d̄ − Δ₀) / SE(d̄)

where d̄ is the mean of the pairwise differences, n is the number of pairs, and SE(d̄) = s_d/√n.

The SE is the ordinary one-sample standard error for the differences. We model the sampling distribution with a t-model with n − 1 degrees of freedom and use that model to obtain the P-value. For paired data, it's the Normality of the differences that we care about: treat those paired differences as you would a single variable, and check the Nearly Normal Condition with a histogram or Normal probability plot. Remember to multiply the P-value by two for two-sided tests.


Confidence Intervals for Matched Pairs
When the conditions are met, we find a confidence interval for the mean of the paired differences:

d̄ ± t*_{n−1} × SE(d̄), where SE(d̄) = s_d/√n

Making a confidence interval for matched pairs follows exactly the same steps as the one-sample t-interval.
Effect Size
When we examine paired differences and fail to reject the null, it could be that there isn't any difference, or that the difference was too small to be detected. A confidence interval is a good way to get a sense of the size of the effect we're trying to understand; to assess the size of the effect, we need a confidence interval.
Wait, What's Independent?
Pairs have to be independent of each other. There are no assumptions about individuals – only pairs. The only dependence in paired data is the pairing itself.
Let's Try It! Paired t-Test
Researchers collected IQ data on an SRS of parents of 36 children identified as "gifted". Below are the results and a histogram of the IQ differences. Run a test to see if mothers and fathers of gifted children have different average IQs.

The parents were chosen randomly, so the differences will be independent. There are more than 360 gifted children with parents in the US (so our 36 pairs are less than 10% of the population), and the histogram appears nearly normal (slight left skew, but n = 36 > 30).

H₀: μ_d = 0
H_A: μ_d ≠ 0

The differences follow a t-distribution with 36 − 1 = 35 degrees of freedom, centered at 0.

SE(d̄) = s_d/√n = 7.5/√36 = 1.25

The t-statistic for the observed difference is:

t₃₅ = (d̄ − Δ₀)/SE(d̄) = 3.4/1.25 = 2.72

Since this is a two-sided test, we need to find the P-value from both tails.

So, we get a P-Value of 0.0101. At the 5% level, we reject the null hypothesis. It does appear that there is a difference in the average IQs of parents of gifted children. Indeed, mothers may have higher IQ scores.
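The P-value in this example can be reproduced from the summary numbers alone:

```python
from scipy import stats

# numbers from the paired example above: dbar = 3.4, s_d = 7.5, n = 36 pairs
dbar, s_d, n = 3.4, 7.5, 36
t = dbar / (s_d / n ** 0.5)                  # 3.4 / 1.25 = 2.72
p = 2 * stats.t.sf(abs(t), df=n - 1)         # two-sided: double the tail area
print(round(t, 2), round(p, 4))              # about 0.0101, matching the text
```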

Dude, I Already Know Hypothesis Tests and CIs, Tell Me About Other Things
OK, here are some random bits of info that didn't make it into the topics of the study guide:

Critical Values for Hypothesis Tests
Critical values can be used as a shortcut for hypothesis tests. Any z- or t-score larger in magnitude than a particular critical value has to be less likely, so it will have a P-value smaller than the corresponding alpha. This is fine if we simply want a yes/no answer, but it doesn't give us much information about the hypothesis since we don't have the P-value to think about.

Table of Traditional Critical Values from the Normal Model

α         0.05     0.01     0.001
1-sided   1.645    2.33     3.09
2-sided   1.96     2.576    3.29
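These values come straight from the normal and t distributions; a quick check in Python (a sketch, assuming SciPy is available):

```python
from scipy import stats

alpha = 0.05
z_one_sided = stats.norm.ppf(1 - alpha)      # all of alpha in one tail -> 1.645
z_two_sided = stats.norm.ppf(1 - alpha / 2)  # alpha split between tails -> 1.96
t_df60 = stats.t.ppf(1 - alpha / 2, df=60)   # about 2.00 at 60 degrees of freedom

print(round(z_one_sided, 3), round(z_two_sided, 2), round(t_df60, 2))
```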

1.96 or 2: 2 is roughly the critical value for testing a hypothesis against a two-sided alternative at α = 0.05. The more exact critical value for z is 1.96; for t, the value is 2.00 at 60 degrees of freedom. As the number of degrees of freedom increases, the t critical value approaches 1.96.

Confidence Intervals for Small Samples
We use the success/failure condition to see whether our sample is large enough to approximate the distribution as normal. We can approximate binomial distributions with normal distributions (which makes calculating so much easier) as long as the sample is large enough. How large should n be? As long as:


np ≥ 10 and n(1 − p) ≥ 10

As long as the numbers of successes and failures are both at least 10, the sample is assumed large enough to approximate normality. BUT what if this assumption fails? We can still make a 95% confidence interval by simply adding four fake observations: two to the successes and two to the failures.

Proportion of successes (p̂) = number of successes (y) / sample size (n)

We then adjust the proportion to: p̃ = (y + 2) / (n + 4)

We also denote: ñ = n + 4

Adjusted interval: p̃ ± z*·√(p̃(1 − p̃)/ñ)
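A minimal sketch of the adjusted interval as a Python helper (the function name `plus_four_interval` and the example counts y = 2, n = 10 are made up for illustration):

```python
from math import sqrt

def plus_four_interval(y, n, z_star=1.96):
    """95% 'plus-four' CI for a proportion: add 2 successes and 2 failures."""
    p_tilde = (y + 2) / (n + 4)
    n_tilde = n + 4
    se = sqrt(p_tilde * (1 - p_tilde) / n_tilde)
    return p_tilde - z_star * se, p_tilde + z_star * se

# A small sample where the success/failure condition fails (only 2 successes)
lo, hi = plus_four_interval(y=2, n=10)
print(round(lo, 3), round(hi, 3))  # roughly (0.049, 0.522)
```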

This adjusted interval works better for proportions near 0 or 1, and using this adjustment means we don't need to check the success/failure condition. This method is called the "plus-four" interval. You can use it safely for any sample size.

Confidence Intervals and Hypothesis Tests
Any value outside the confidence interval is a null hypothesis value we would reject, and we feel more strongly about values far outside our interval. Confidence intervals and hypothesis tests have the same assumptions/conditions. You can approximate a hypothesis test by examining the confidence interval: see whether the null hypothesis value is consistent with a confidence interval for the parameter at the corresponding confidence level. Confidence intervals are naturally two-sided. In general, a confidence interval with confidence level C% corresponds to a two-sided hypothesis test with an alpha level of (100 − C)%, and to a one-sided hypothesis test with an alpha level of ½(100 − C)%.

More on Errors
No hypothesis test is 100% certain. Because the test is based on probabilities, there is always a chance of drawing an incorrect conclusion.

Type I Error: rejecting the null when it is true. The probability of making a Type I error is alpha. An alpha of 0.05 means you are willing to accept a 5% chance of being wrong when you reject the null hypothesis. If you want to lower this risk, you have to lower the value of alpha. BUT doing so means you will be less likely to detect a true difference if one really exists, since it's harder to reject the null when alpha is very small. So when alpha decreases, the probability of making a Type I error decreases and the probability of making a Type II error increases.

Type II Error: failing to reject the null when it is false. The probability of making a Type II error is beta (β). You can decrease your risk of committing a Type II error by ensuring your test has enough power (1 − β).
You can do this by ensuring your sample size is large enough to detect a difference when one truly exists.
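The correspondence between confidence intervals and two-sided tests described above can be checked numerically (a sketch with made-up counts; note that the interval uses the SE based on p̂ while the test uses the SE based on the null value, so the two can disagree right at the boundary):

```python
from math import sqrt

y, n, p0, z_star = 60, 100, 0.5, 1.96  # made-up data; testing H0: p = 0.5

p_hat = y / n
# 95% confidence interval: SE computed from p_hat
se_ci = sqrt(p_hat * (1 - p_hat) / n)
ci = (p_hat - z_star * se_ci, p_hat + z_star * se_ci)
# Two-sided z-test at alpha = 0.05: SE computed from the null value p0
se_test = sqrt(p0 * (1 - p0) / n)
z = (p_hat - p0) / se_test

reject_by_ci = not (ci[0] <= p0 <= ci[1])  # is the null value outside the CI?
reject_by_test = abs(z) > z_star           # is |z| beyond the critical value?
print(reject_by_ci, reject_by_test)        # both True here
```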


Type I and Type II errors are inversely related: when one goes up, the other goes down.

Decision: Fail to Reject
  If the null is true:  correct decision (probability = 1 − α)
  If the null is false: Type II error, failing to reject the null when it is false (probability = β)

Decision: Reject
  If the null is true:  Type I error, rejecting the null when it is true (probability = α)
  If the null is false: correct decision (probability = 1 − β)
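The α entry in the table can be verified by simulation (a sketch: run many z-tests in a world where the null is actually true and count how often we reject at α = 0.05):

```python
import random

random.seed(0)
z_crit = 1.96      # two-sided critical value at alpha = 0.05
trials = 100_000

# Under a true null, the test statistic is z ~ N(0, 1); rejecting is an error
type_i = sum(abs(random.gauss(0, 1)) > z_crit for _ in range(trials))
type_i_rate = type_i / trials
print(type_i_rate)  # about 0.05, matching alpha
```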

How to Increase the Power of a Test
1. Use a larger sample: a larger sample provides more information about the population and thus increases the power.
2. Decrease your standard deviation by improving your process: when the SD decreases, power increases.
3. Use a higher significance level: this increases the probability that you reject the null, but it also increases the probability of a Type I error.
4. Use a directional hypothesis (one-tailed test): a directional hypothesis has more power to detect differences, BUT it can't detect differences in the opposite direction.

Good Luck Guys! The two tests that are missing are inference for regression and the Chi-Squared test. Please cover those topics on your own.

Special thanks to Professor Quarfoot for letting me borrow material from his slides.

