Chi-Square Test
Dept. of AGB, Veterinary College, SHIMOGA
The Chi-square test (χ²) is a useful measure for comparing experimentally obtained results with those expected theoretically under a hypothesis.
STAT 512 2018-19 AGB VCS RJ
The chi-square measure quantifies the degree of discrepancy between observed frequencies and theoretical frequencies, and thus helps determine whether that discrepancy can be attributed to chance (sampling error) or is too large to be explained by chance alone.
$$\chi^2 = \sum \frac{(O - E)^2}{E}$$

$$\chi^2 = \frac{(O_1 - E_1)^2}{E_1} + \frac{(O_2 - E_2)^2}{E_2} + \dots + \frac{(O_n - E_n)^2}{E_n}$$

O = Observed frequency
E = Expected frequency
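The chi-square formula can be sketched in a few lines of Python. The function name `chi_square_statistic` and the sample frequencies are illustrative assumptions, not taken from the slides.

```python
def chi_square_statistic(observed, expected):
    """Return the sum of (O - E)^2 / E over paired frequencies."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical frequencies for three classes (illustration only).
observed = [18, 22, 20]
expected = [20, 20, 20]

chi2 = chi_square_statistic(observed, expected)
print(round(chi2, 3))
```

Each class contributes its own squared deviation scaled by its own expected frequency, so classes with small expected counts weigh more heavily in the total.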
Application of Chi-Square test:
– To test goodness of fit.
– To test the independence of attributes or characters.
– To test the homogeneity of independent estimates of the population variance.
– To detect linkage.
To test goodness of fit:
• This tests whether there is a significant difference between the observed frequencies and those expected under a theoretical distribution such as the Binomial, Poisson, or Normal distribution.

$$\chi^2 = \sum \frac{(O - E)^2}{E}$$

• The calculated value is compared with the table value at (n-1) degrees of freedom, where n is the number of classes, at the selected level of significance.
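As a rough sketch of a goodness-of-fit test, the snippet below checks hypothetical counts against a 3:1 ratio. The counts, the ratio, and the variable names are assumptions made for illustration. The table value 3.841 is the 5% value for 1 degree of freedom.

```python
# Hypothetical data: 400 individuals falling into two classes,
# tested against a theoretical 3:1 ratio.
observed = [290, 110]
ratio = [3, 1]

total = sum(observed)
expected = [total * part / sum(ratio) for part in ratio]  # 300 and 100

# Chi-square = sum of (O - E)^2 / E over the classes.
chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Degrees of freedom = number of classes - 1.
df = len(observed) - 1

TABLE_VALUE_1DF_5PCT = 3.841
significant = chi2 > TABLE_VALUE_1DF_5PCT

print(f"chi-square = {chi2:.3f}, d.f. = {df}, significant = {significant}")
```

With these assumed counts the calculated value stays below the table value, so the observed frequencies would be judged consistent with the 3:1 ratio.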
Chi-square test as a test of Independence (association)
• This is used to find out whether two or more attributes are associated (related) or not.

          1              2              Total
A         a (Observed)   b (Observed)   a+b
B         c (Observed)   d (Observed)   c+d
Total     a+c            b+d            a+b+c+d = N
Expected frequency = (Row Total × Column Total) / N
N = Grand Total

Direct method:

$$\chi^2 = \frac{N (ad - bc)^2}{(a + b)(a + c)(b + d)(c + d)}$$
• The calculated chi-square value is compared with the table value at (r-1)(c-1) degrees of freedom at the selected level of significance.
• If the calculated value is greater than the table value, the difference between the observed and expected frequencies is significant.
• If the calculated value is less than the table value, the difference between the observed and expected frequencies is not significant.
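The decision rule can be written as a small lookup, sketched below. The function names and the dictionary are illustrative assumptions. The dictionary holds the standard 5% chi-square table values for 1 to 4 degrees of freedom.

```python
# Standard chi-square table values at the 5% level of significance.
TABLE_VALUES_5PCT = {1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488}


def degrees_of_freedom(rows, cols):
    """Degrees of freedom for an r x c contingency table: (r-1)(c-1)."""
    return (rows - 1) * (cols - 1)


def is_significant(chi2, rows, cols):
    """True if the calculated value exceeds the 5% table value."""
    df = degrees_of_freedom(rows, cols)
    return chi2 > TABLE_VALUES_5PCT[df]


# Hypothetical calculated values for a 2 x 2 table.
print(is_significant(4.2, rows=2, cols=2))
print(is_significant(3.0, rows=2, cols=2))
```

A table with more rows or columns needs a larger calculated value to reach significance, because the table value grows with the degrees of freedom.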
Chi-square test as test of homogeneity
• Used to determine whether two or more samples are drawn from the same population or from different populations, especially when comparing observed and expected ratios in Mendelian genetics.
• The calculated chi-square value is compared with the table value at (no. of classes - 1) degrees of freedom at the selected level of significance.
Example:
• In an experiment on immunization of goats against anthrax, the following results were obtained. Derive your inference on the vaccine.

                 Died of anthrax   Survived   Total
Inoculated       2 (a)             10 (b)     12 (a+b)
Not Inoculated   6 (c)             6 (d)      12 (c+d)
Total            8 (a+c)           16 (b+d)   24 (a+b+c+d = N)
Null hypothesis: The vaccine has no effect on death from anthrax (inoculation and outcome are independent).

Observed frequency table

            Inoculated   Not Inoculated   Total
Died        2            6                8
Survived    10           6                16
Total       12           12               24
Expected frequency table:

Expected frequency = (Row Total × Column Total) / N

For the first cell: (8 × 12) / 24 = 4

            Inoculated   Not Inoculated   Total
Died        4            4                8
Survived    8            8                16
Total       12           12               24
Computation table for χ²

O     E    O-E   (O-E)²   (O-E)²/E
2     4    -2    4        1.0
10    8     2    4        0.5
6     4     2    4        1.0
6     8    -2    4        0.5

$$\chi^2 = \sum \frac{(O - E)^2}{E} = 3.0$$
Degrees of freedom: (r-1)(c-1) = (2-1)(2-1) = 1
The table value for 1 d.f. at the 5% level of significance is 3.841. The calculated value, 3.0, is less than the table value. Hence the null hypothesis is accepted; the observed and expected frequencies do not differ significantly.
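The direct method gives the same value for the goat data without building the expected frequency table. The sketch below applies it to a = 2, b = 10, c = 6, d = 6. The function name `chi_square_2x2_direct` is an illustrative assumption.

```python
def chi_square_2x2_direct(a, b, c, d):
    """Direct formula for a 2 x 2 table: N(ad - bc)^2 / ((a+b)(a+c)(b+d)(c+d))."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (a + c) * (b + d) * (c + d))


# Cell counts from the anthrax example:
# a = inoculated, died;      b = inoculated, survived
# c = not inoculated, died;  d = not inoculated, survived
a, b, c, d = 2, 10, 6, 6

chi2 = chi_square_2x2_direct(a, b, c, d)
TABLE_VALUE_1DF_5PCT = 3.841

print(f"chi-square = {chi2:.1f}")
print("significant" if chi2 > TABLE_VALUE_1DF_5PCT else "not significant")
```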
The following data give the number of sheep that lambed normally on a farm and the number that had difficulty in lambing during various seasons. Estimate the Chi-square value and find out whether season has a significant effect on lambing.

Season                  Summer   Monsoon   Winter
Normal lambing          5        6         2
Difficulty in lambing   1        2         3
Observed data

Season                  Summer   Monsoon   Winter   Total
Normal lambing          5        6         2        13
Difficulty in lambing   1        2         3        6
Total                   6        8         5        19
Expected frequency

Expected frequency = (Row Total × Column Total) / N

E1 = (6 × 13) / 19 = 4.10
E2 = (8 × 13) / 19 = 5.47
E3 = (5 × 13) / 19 = 3.42
E4 = (6 × 6) / 19 = 1.89
E5 = (8 × 6) / 19 = 2.53
E6 = (5 × 6) / 19 = 1.58
Expected Frequency Table:

Season                  Summer   Monsoon   Winter
Normal lambing          4.10     5.47      3.42
Difficulty in lambing   1.89     2.53      1.58
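O    E      O-E     (O-E)²   (O-E)²/E
5    4.10    0.90   0.8100   0.198
6    5.47    0.53   0.2809   0.051
2    3.42   -1.42   2.0164   0.590
1    1.89   -0.89   0.7921   0.419
2    2.53   -0.53   0.2809   0.111
3    1.58    1.42   2.0164   1.276

$$\chi^2 = \sum \frac{(O - E)^2}{E} = 2.65$$

Degrees of freedom: (r-1)(c-1) = (2-1)(3-1) = 2
The table value for 2 d.f. at the 5% level of significance is 5.991. The calculated value is less than the table value, so the effect of season on lambing is not significant.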
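The lambing calculation can be reproduced with a short script, sketched here with assumed variable names. It divides each squared deviation by the expected frequency E of its own cell, as the formula Σ(O - E)²/E specifies, and uses unrounded expected frequencies. This gives a statistic of about 2.65 on 2 degrees of freedom.

```python
# Observed counts from the lambing example.
# Rows: normal lambing, difficulty in lambing.
# Columns: summer, monsoon, winter.
observed = [
    [5, 6, 2],
    [1, 2, 3],
]

row_totals = [sum(row) for row in observed]        # 13 and 6
col_totals = [sum(col) for col in zip(*observed)]  # 6, 8 and 5
n = sum(row_totals)                                # 19

# Expected frequency = row total x column total / N for every cell.
expected = [[r * c / n for c in col_totals] for r in row_totals]

# Chi-square = sum of (O - E)^2 / E over all cells.
chi2 = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)

# Degrees of freedom = (rows - 1)(columns - 1).
df = (len(observed) - 1) * (len(observed[0]) - 1)

TABLE_VALUE_2DF_5PCT = 5.991
print(f"chi-square = {chi2:.3f}, d.f. = {df}")
print("significant" if chi2 > TABLE_VALUE_2DF_5PCT else "not significant")
```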
OBSERVED DATA

                    GROUP 1   GROUP 2   Totals of Rows
TRAIT 1             56        139       195.00
TRAIT 2             144       61        205.00
Totals of Columns   200.00    200.00    400.00

EXPECTED DATA if there's no difference between the groups

          GROUP 1   GROUP 2
TRAIT 1   97.50     97.50
TRAIT 2   102.50    102.50

CHI-SQUARE CALCULATION

          GROUP 1   GROUP 2
TRAIT 1   17.6641   17.6641
TRAIT 2   16.8024   16.8024

Chi-square = sum of the four cells (B18+B19+C18+C19 in the spreadsheet) = 68.9331
d.f. = (rows-1)(columns-1) = 1
Critical value: 3.84
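The spreadsheet layout maps onto code as three small tables: observed, expected, and per-cell contributions. The sketch below rebuilds them from the observed counts. The variable names are assumptions and the cell references of the spreadsheet are not used.

```python
# Observed counts. Rows: trait 1, trait 2. Columns: group 1, group 2.
observed = [
    [56, 139],
    [144, 61],
]

row_totals = [sum(row) for row in observed]        # 195 and 205
col_totals = [sum(col) for col in zip(*observed)]  # 200 and 200
n = sum(row_totals)                                # 400

# Expected counts if there is no difference between the groups.
expected = [[r * c / n for c in col_totals] for r in row_totals]

# Contribution of every cell to chi-square: (O - E)^2 / E.
contributions = [
    [(o - e) ** 2 / e for o, e in zip(obs_row, exp_row)]
    for obs_row, exp_row in zip(observed, expected)
]

chi2 = sum(sum(row) for row in contributions)

for row in contributions:
    print([round(value, 4) for value in row])
print(f"chi-square = {chi2:.4f}")
```

The calculated value is far above the critical value of 3.84 for 1 degree of freedom, so the two groups differ significantly in the distribution of the trait.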
Definition: The Binomial Distribution is one of the discrete probability distributions. It is used when a trial has exactly two mutually exclusive outcomes, labeled Success and Failure. The Binomial Distribution gives the probability of observing r successes in n trials, where p denotes the probability of success on a single trial.
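The probability described in this definition follows the standard binomial formula P(r) = C(n, r) p^r (1 - p)^(n - r). The sketch below evaluates it. The function name and the example values (n = 4, p = 0.5) are assumptions for illustration.

```python
from math import comb


def binomial_probability(r, n, p):
    """Probability of exactly r successes in n independent trials."""
    if not 0 <= r <= n:
        raise ValueError("r must lie between 0 and n")
    return comb(n, r) * p ** r * (1 - p) ** (n - r)


# Hypothetical example: 4 trials with success probability 0.5 each.
n, p = 4, 0.5
probabilities = [binomial_probability(r, n, p) for r in range(n + 1)]

for r, prob in enumerate(probabilities):
    print(f"P({r} successes) = {prob:.4f}")
```

Multiplying these probabilities by the number of observations gives the expected frequencies needed for a goodness-of-fit test against the Binomial distribution.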
sl no         age           BW in kgs     height in cms   resp rate     heart rate    pulse         Height
1             17            46            169             17            69.33         58.66         169
2             18            49            167             14            70.3          57.5          167
3             17            47            153             15            56            56            153
4             19            57            174             14.617        63            64.3          174
5             18            51            167             17            71            64            167
6             17            52            150             17            73            56            150
7             18            47            155             16.33         66.6          39            155
8             17            50            141             12.33         53            33.33         141
9             18            45            160             11.3          67.3          59            160
10            17            54            163             11.33         81            71.67         163
11            17            54            150             16            71            71            150
12            17            45            167             19            69            49.6          167
13            17            45            165             14            75.3          52.6          165
14            18            55            167             14.2          70.33         52.9          167
15            18            54            158             15            55            42.3          158
16            18            54            165             15.6          62.6          54.6          165
17            18            64            172             13            80            72            172
18            18            44            156             12            75            69            156
Total         317           913           2899            264.707       1228.76       1023.46
Average       17.61111111   50.72222222   161.0555556     14.70594444   68.26444444   56.8588889
Correlation   0.348732682   0.272634657   0.113561422     -0.12930329   0.688007384   0.41096649
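The first value in the Correlation row (0.3487) appears to be the correlation between age and body weight. This is an inference from the numbers, since the table does not say which pairs of columns were correlated. A minimal sketch of the Pearson correlation coefficient, with an assumed function name, checks this against the age and BW columns.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Age and body weight (kg) columns, in serial-number order.
age = [17, 18, 17, 19, 18, 17, 18, 17, 18, 17, 17, 17, 17, 18, 18, 18, 18, 18]
bw = [46, 49, 47, 57, 51, 52, 47, 50, 45, 54, 54, 45, 45, 55, 54, 54, 64, 44]

r = pearson_r(age, bw)
print(f"r(age, BW) = {r:.4f}")
```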
age   BW in kgs                     Mean BW in kgs
17    46 47 52 50 54 54 45 45       49.125
18    49 51 47 45 55 54 54 64 44    51.44
19    57                            57