Virtual University of Pakistan Lecture No. 38 of the course on Statistics and Probability by Miss Saleha Naghmi Habibullah
IN THE LAST LECTURE, YOU LEARNT
• Hypothesis-Testing (continuation of basic concepts)
• Hypothesis-Testing regarding µ (based on Z-statistic)
TOPICS FOR TODAY
• Hypothesis-Testing regarding µ1 - µ2 (based on Z-statistic)
• Hypothesis Testing regarding p (based on Z-statistic)
In the last lecture, we discussed the basic concepts involved in hypothesis-testing. We also applied these concepts to a few examples regarding the testing of the population mean µ. These examples pointed to the six main steps involved in any hypothesis-testing procedure:
General Procedure for Testing Hypotheses:
Testing a hypothesis about a population parameter involves the following six steps:
i) State your problem and formulate an appropriate null hypothesis H0 with an alternative hypothesis H1, which is to be accepted when H0 is rejected.
ii) Decide upon a significance level of the test, α, which is the probability of rejecting the null hypothesis if it is true.
iii) Choose a test-statistic, such as one following the normal distribution, the t-distribution, etc., to test H0.
iv) Determine the rejection or critical region in such a way that the probability of rejecting the null hypothesis H0, if it is true, is equal to the significance level, α. The location of the critical region depends upon the form of H1 (i.e. whether we are carrying out a one-tailed test or a two-tailed test). The critical value(s) will separate the acceptance region from the rejection region.
v) Compute the value of the test-statistic from the sample data in order to decide whether to accept or reject the null hypothesis H0.
vi) Formulate the decision rule (i.e. draw a conclusion) as follows:
a) Reject the null hypothesis H0, if the computed value of the test statistic falls in the rejection region.
b) Accept the null hypothesis H0, otherwise.
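As a rough illustration of steps (iv) to (vi), the following Python sketch locates the critical region of a Z-test and applies the decision rule. The function name `z_test_decision`, the `tail` labels and the computed value 2.10 are assumptions made only for this illustration; they do not come from the lecture.

```python
from statistics import NormalDist

def z_test_decision(z_computed, alpha, tail="two"):
    """Locate the critical region (step iv) and decide about H0 (step vi).

    tail is "two", "right" or "left" (hypothetical labels for this sketch).
    Returns the positive or negative critical value and the decision.
    """
    std_normal = NormalDist()  # standard normal: mean 0, standard deviation 1
    if tail == "two":
        # alpha is split equally between the two tails
        critical = std_normal.inv_cdf(1 - alpha / 2)
        reject = abs(z_computed) > critical
    elif tail == "right":
        critical = std_normal.inv_cdf(1 - alpha)
        reject = z_computed > critical
    elif tail == "left":
        critical = std_normal.inv_cdf(alpha)
        reject = z_computed < critical
    else:
        raise ValueError("tail must be 'two', 'right' or 'left'")
    return critical, reject

# Illustrative (made-up) computed value of Z, tested at the 5% level
critical, reject = z_test_decision(z_computed=2.10, alpha=0.05, tail="two")
print(f"critical value = {critical:.2f}, reject H0: {reject}")
```

The form of H1 decides which branch is used, which mirrors the remark in step (iv) that the location of the critical region depends upon the alternative hypothesis.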
Important Note: It is very important to realize that when applying a hypothesis-testing procedure of the type explained above, we always begin by assuming that the null hypothesis is true.
Important Note: Since s² is an unbiased estimator of σ² whereas S² is a biased estimator, we would like to use s² whenever σ² is unknown. However, when n is large, s² is approximately equal to S², as explained below:
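We know that
$$s^2 = \frac{\sum (x - \bar{x})^2}{n - 1} \;\Rightarrow\; \sum (x - \bar{x})^2 = (n - 1)\,s^2,$$
whereas
$$S^2 = \frac{\sum (x - \bar{x})^2}{n} \;\Rightarrow\; \sum (x - \bar{x})^2 = n\,S^2.$$
Hence
$$(n - 1)\,s^2 = n\,S^2 \;\Rightarrow\; S^2 = \frac{n - 1}{n}\,s^2 = \left(1 - \frac{1}{n}\right) s^2.$$
Now, as $n \to \infty$, $\frac{1}{n} \to 0$. Hence, if n is large, $S^2 \approx s^2$.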
Hence, in case of a large sample drawn from a population with unknown variance σ², we may replace σ² by S².
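The relation S² = (1 − 1/n) s² can be checked numerically. The sketch below uses Python's standard `statistics` module, where `variance` divides by n − 1 and `pvariance` divides by n. The data values and the helper name `compare_variances` are made up for illustration only.

```python
from statistics import pvariance, variance

def compare_variances(data):
    """Return n, s^2 (divisor n - 1) and S^2 (divisor n) for the data."""
    n = len(data)
    s2 = variance(data)    # unbiased estimator, divisor n - 1
    S2 = pvariance(data)   # biased estimator, divisor n
    return n, s2, S2

small = [2.0, 4.0, 4.0, 5.0, 7.0]   # hypothetical small sample, n = 5
large = small * 200                 # same values repeated, n = 1000

for data in (small, large):
    n, s2, S2 = compare_variances(data)
    # The ratio S2 / s2 equals 1 - 1/n, which approaches 1 as n grows
    print(n, round(s2, 4), round(S2, 4), round(S2 / s2, 4))
```

For the small sample the ratio is visibly below 1, while for n = 1000 the two quantities are nearly identical, which is the point of the note above.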
We now consider the case when we are interested in testing the equality of two population means. We illustrate this situation with the help of the following example:
EXAMPLE A survey conducted by a market-research organization five years ago showed that the estimated hourly wage for temporary computer analysts was essentially the same as the hourly wage for registered nurses.
This year, a random sample of 32 temporary computer analysts from across the country is taken. The analysts are contacted by telephone and asked what rates they are currently able to obtain in the market-place.
A similar random sample of 34 registered nurses is taken.
The resulting wage figures are listed in the following table:
Hourly wage figures ($)

Computer Analysts (32 values):
24.10  23.75  24.25  22.00  23.50  22.80  24.00  23.85  24.20  22.90  23.20  23.55
25.00  22.70  21.30  22.55  23.25  22.10  24.25  23.50  22.75  23.80
24.25  21.75  22.00  18.00  23.50  22.70  21.50  23.80  25.60  24.10

Registered Nurses (34 values):
20.75  23.80  22.00  21.85  24.16  21.10  23.75  22.50  25.00  22.70  23.25  21.90
23.30  24.00  21.75  21.50  20.40  23.25  19.50  21.75  20.80  20.25  22.45  19.10
22.75  23.00  21.25  20.00  21.75  20.50  22.60  21.70  20.75  22.50
Conduct a hypothesis test at the 2% level of significance to determine whether the hourly wages of the computer analysts are still the same as those of registered nurses.
SOLUTION Hypothesis Testing Procedure: Step-1: Formulation of the Null and Alternative Hypotheses: H0 : µ1 – µ2 = 0 HA : µ1 – µ2 ≠ 0 (Two-tailed test)
Step-2: Level of Significance: α = 0.02
Step-3: Test Statistic:
$$Z = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}$$
Step-4: Calculations: The sample size, sample mean and sample variance for each of the two samples are given below:
Computer Analysts: n1 = 32, X̄1 = $23.14, S1² = 1.854
Registered Nurses: n2 = 34, X̄2 = $21.99, S2² = 1.845
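The sample sizes and sample means quoted above can be recomputed from the wage figures listed in the table. The following is a minimal sketch using Python's standard `statistics.mean`; the variable names are chosen only for this illustration, and the sketch checks the sizes and means only.

```python
from statistics import mean

# Hourly wages ($) as listed in the table for the two samples
analysts = [
    24.10, 23.75, 24.25, 22.00, 23.50, 22.80, 24.00, 23.85, 24.20, 22.90,
    23.20, 23.55, 25.00, 22.70, 21.30, 22.55, 23.25, 22.10, 24.25, 23.50,
    22.75, 23.80, 24.25, 21.75, 22.00, 18.00, 23.50, 22.70, 21.50, 23.80,
    25.60, 24.10,
]
nurses = [
    20.75, 23.80, 22.00, 21.85, 24.16, 21.10, 23.75, 22.50, 25.00, 22.70,
    23.25, 21.90, 23.30, 24.00, 21.75, 21.50, 20.40, 23.25, 19.50, 21.75,
    20.80, 20.25, 22.45, 19.10, 22.75, 23.00, 21.25, 20.00, 21.75, 20.50,
    22.60, 21.70, 20.75, 22.50,
]

n1, n2 = len(analysts), len(nurses)          # sample sizes
xbar1, xbar2 = mean(analysts), mean(nurses)  # sample means

print(n1, round(xbar1, 2))  # 32 23.14
print(n2, round(xbar2, 2))  # 34 21.99
```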
Since the sample sizes are larger than 30, the unknown population variances σ1² and σ2² can be replaced by S1² and S2². Hence, our formula becomes:
$$Z = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{S_1^2}{n_1} + \dfrac{S_2^2}{n_2}}}$$
Hence, the computed value of Z comes out to be:
$$Z = \frac{(23.14 - 21.99) - 0}{\sqrt{\dfrac{1.854}{32} + \dfrac{1.845}{34}}} = \frac{1.15}{0.335} = 3.43$$
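The arithmetic of Step-4 can be reproduced with a short Python sketch. The function name `two_sample_z` and its parameter names are assumptions for this illustration; the numbers are the summary statistics given in the lecture.

```python
from math import sqrt

def two_sample_z(xbar1, xbar2, var1, var2, n1, n2, hypothesized_diff=0.0):
    """Z-statistic for the difference between two means (large samples).

    var1 and var2 are the sample variances that stand in for the unknown
    population variances.
    """
    standard_error = sqrt(var1 / n1 + var2 / n2)
    return ((xbar1 - xbar2) - hypothesized_diff) / standard_error

# Summary statistics from the computer analysts / registered nurses example
z = two_sample_z(23.14, 21.99, 1.854, 1.845, 32, 34)
print(round(z, 2))  # 3.43
```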
Step-5: Critical Region: As the level of significance is 2%, and this is a two-tailed test, hence, we have the following situation:
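[Figure: standard normal curve centred at 0, with an area of 0.49 on each side of the centre between the critical values, and a rejection region of area α/2 = 0.01 in each tail, beyond Z.01 = −2.33 and Z.01 = +2.33.]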
Hence, the critical region is given by | Z | > 2.33
Step-6: Conclusion: As the computed value i.e. 3.43 is greater than the tabulated value 2.33, hence, we reject H0.
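[Figure: standard normal curve on the Z scale with critical values Z.01 = −2.33 and Z.01 = +2.33 on either side of Z = 0; the calculated Z = 3.43 lies beyond +2.33. On the corresponding X̄1 − X̄2 scale, the centre is µ1 − µ2 = 0 and the observed difference is X̄1 − X̄2 = 1.15.]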
The researcher can say that there is a significant difference between the average hourly wage of a temporary computer analyst and the average hourly wage of a temporary registered nurse.
The researcher then examines the sample means and uses common sense to conclude that, on the average, temporary computer analysts earn more than temporary registered nurses.
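The critical value used in Step-5 and the decision in Step-6 can be sketched in Python with `statistics.NormalDist` instead of a printed table. The variable names are chosen for this illustration only; the computed Z of 3.43 is the value obtained in Step-4.

```python
from statistics import NormalDist

alpha = 0.02        # level of significance from Step-2
z_computed = 3.43   # computed value of Z from Step-4

# Two-tailed test: alpha/2 = 0.01 lies in each tail
z_critical = NormalDist().inv_cdf(1 - alpha / 2)

# Decision rule: reject H0 when |Z| exceeds the critical value
reject_h0 = abs(z_computed) > z_critical

print(round(z_critical, 2), reject_h0)  # 2.33 True
```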
Let us consolidate the above concept by considering another example:
EXAMPLE Suppose that the workers of factory B believe that the average income of the workers of factory A exceeds their average income. A random sample of workers is drawn from each of the two factories, and the two samples yield the following information:
Factory    Sample Size    Mean     Variance
A          160            12.80    64
B          220            11.25    47
Test the above hypothesis.
SOLUTION Let subscript 1 denote values pertaining to Factory A, and let subscript 2 denote values pertaining to Factory B. Then, we proceed as follows:
Hypothesis-testing Procedure:
Step 1: H0 : µ1 ≤ µ2 (or µ1 − µ2 ≤ 0)
HA : µ1 > µ2 (or µ1 − µ2 > 0).
Step 2: Level of significance = 5%.
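Steps 3 & 4:
$$Z = \frac{(\bar{x}_1 - \bar{x}_2) - 0}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}} = \frac{12.80 - 11.25}{\sqrt{\dfrac{64}{160} + \dfrac{47}{220}}} = \frac{1.55}{\sqrt{0.61}} = \frac{1.55}{0.78} = 1.99$$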
Step 5: Critical Region: Since it is a right-tailed test, the critical region is given by Z > Z0.05, i.e. Z > 1.645.
Step 6: Conclusion: Since 1.99 is greater than 1.645, H0 should be rejected in favour of HA. The sample evidence supports the belief of the workers of factory B.
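The factory example can be retraced with the following Python sketch, in which the variable names are assumptions for illustration. Without rounding the denominator to 0.78, the computed Z comes out near 1.98 rather than 1.99; the conclusion is unaffected.

```python
from math import sqrt
from statistics import NormalDist

# Summary data: subscript 1 = Factory A, subscript 2 = Factory B
n1, xbar1, var1 = 160, 12.80, 64
n2, xbar2, var2 = 220, 11.25, 47
alpha = 0.05

# Test statistic with a hypothesized difference of 0
z = (xbar1 - xbar2 - 0) / sqrt(var1 / n1 + var2 / n2)

# Right-tailed test: the whole of alpha lies in the upper tail
z_critical = NormalDist().inv_cdf(1 - alpha)
reject_h0 = z > z_critical

print(round(z, 2), round(z_critical, 3), reject_h0)  # 1.98 1.645 True
```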
Next, we consider the case when we are interested in conducting a test regarding p, the proportion of successes in the population. We illustrate this situation with the help of the following example:
EXAMPLE A sociologist has a hunch that not more than 50% of the children who appear in a particular juvenile court three times or more are orphans.
To test this hypothesis, a sample of 634 such children is taken and it is found that 341 of these children are orphans (one or both parents dead). Test the above hypothesis using the 1% level of significance.
SOLUTION Hypothesis-testing Procedure:
Step 1: H0 : p ≤ 0.50
HA : p > 0.50 (one-tailed test)
Step 2: Level of significance: α = 1%
Step 3: Test statistic:
$$Z = \frac{X \pm \tfrac{1}{2} - n p_0}{\sqrt{n p_0 (1 - p_0)}}$$
(where ± ½ denotes the continuity correction: X − ½ is used when X > np0, and X + ½ when X < np0)
Step 4: Computation: Here np0 = 634 (0.50) = 317 and X = 341. Hence X > np0, so we use X − ½. So
$$Z = \frac{341 - \tfrac{1}{2} - 317}{\sqrt{634\,(0.50)(0.50)}} = \frac{23.5}{12.59} = 1.87$$
Step 5: Critical region: Since α = 0.01, the critical region is given by Z > 2.33.
Step 6: Conclusion: Since 1.87 < 2.33, the computed Z does not fall in the critical region. Hence, we conclude that the sociologist's hunch is acceptable.
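The test regarding p, including the continuity correction, can be sketched in Python as follows. The function name `proportion_z` is hypothetical, and leaving X uncorrected when it equals np0 exactly is an assumption of this sketch, since the lecture does not discuss that case.

```python
from math import sqrt
from statistics import NormalDist

def proportion_z(x, n, p0):
    """Z-statistic for a test about p, with the continuity correction."""
    expected = n * p0
    if x > expected:
        corrected = x - 0.5   # X above n*p0: subtract one half
    elif x < expected:
        corrected = x + 0.5   # X below n*p0: add one half
    else:
        corrected = x         # assumption: no correction when X equals n*p0
    return (corrected - expected) / sqrt(n * p0 * (1 - p0))

# Juvenile-court example: 341 orphans among 634 children, p0 = 0.50
z = proportion_z(341, 634, 0.50)
z_critical = NormalDist().inv_cdf(1 - 0.01)   # right-tailed test, alpha = 1%
reject_h0 = z > z_critical

print(round(z, 2), round(z_critical, 2), reject_h0)  # 1.87 2.33 False
```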
IN TODAY’S LECTURE, YOU LEARNT
• Hypothesis-Testing regarding µ1 - µ2 (based on Z-statistic)
• Hypothesis Testing regarding p (based on Z-statistic)
IN THE NEXT LECTURE, YOU WILL LEARN
• Hypothesis Testing Regarding p1 - p2 (based on Z-statistic)
• The Student's t-distribution
• Confidence Interval for µ based on the t-distribution