• Spearman’s rank correlation yields a correlation coefficient between two ordinal (ranked) variables.
• The formula is
$$ r_s = 1 - \frac{6\sum D^2}{n(n^2 - 1)} $$
where D is the difference between paired ranks and n is the number of paired observations. The number “6” is a constant.
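The formula can be sketched as a small Python function. This is a minimal illustration, not part of the original slides: the function name `spearman_rs` and the five-case rank lists are hypothetical, and the inputs are assumed to be ranks already.

```python
def spearman_rs(rank_x, rank_y):
    """Spearman's r_s from two equal-length lists of paired ranks."""
    if len(rank_x) != len(rank_y):
        raise ValueError("rank lists must have the same length")
    n = len(rank_x)
    # D is the difference between the paired ranks of each case.
    sum_d2 = sum((rx - ry) ** 2 for rx, ry in zip(rank_x, rank_y))
    return 1 - (6 * sum_d2) / (n * (n ** 2 - 1))


# Hypothetical ranks for five cases (not taken from the slides).
same_order = spearman_rs([1, 2, 3, 4, 5], [1, 2, 3, 4, 5])     # identical rankings
reverse_order = spearman_rs([1, 2, 3, 4, 5], [5, 4, 3, 2, 1])  # opposite rankings
print(same_order, reverse_order)
```

Identical rankings give the upper limit of the coefficient and exactly reversed rankings give the lower limit, which is a quick sanity check on the constant 6 in the numerator.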
Pharmaceutical Biostatistics: Correlation & Regression
• For each observed case, the rank of the case on each of the variables X and Y is determined by ordering the values from low to high, or from high to low.
• The hypothesis test for Spearman’s correlation coefficient is the same as the test for Pearson’s r.
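As an illustration of the hypothesis test, the sketch below assumes the usual t-test form for a correlation coefficient, t = r·sqrt((n − 2)/(1 − r²)) with n − 2 degrees of freedom. The slides do not spell out the test, so this form is an assumption, and for small samples it is only an approximation when applied to Spearman's coefficient. The function name `t_statistic` is hypothetical; the values 0.924 and n = 10 come from the worked example in this section.

```python
import math


def t_statistic(r, n):
    """t statistic for H0: no correlation, with n - 2 degrees of freedom."""
    return r * math.sqrt((n - 2) / (1 - r ** 2))


r_s, n = 0.924, 10  # values from the worked GNP vs LE example
t = t_statistic(r_s, n)
print(f"t = {t:.2f} on {n - 2} degrees of freedom")
```

The resulting statistic would then be compared with the critical value from a t table at the chosen significance level.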
Example: Ten countries were randomly selected from a list of 129 countries. The countries are compared with respect to their gross national product per capita (GNP), an indicator of the average income level of each country. Researchers have found that people in countries with higher income levels tend to live longer than people in countries with lower income levels. In addition, birth rates tend to be higher in countries with lower incomes, and decline as countries become better off economically. The following table gives the GNP, the mean life expectancy at birth (LE), and the crude birth rate (CBR) for each of the 10 countries.
Example:

Country          GNP    LE   CBR
Algeria         2360    65    34
India            340    59    31
Mongolia         780    62    36
El Salvador      940    64    36
Ecuador         1120    66    32
Malaysia        1940    74    32
Ireland         7750    74    18
Argentina       2520    71    21
France         16090    76    14
Sierra Leone     240    47    48
Question:
•  Use this data to determine the Spearman rank correlation of GNP with (i) life expectancy, and (ii) the birth rate.
Solution Steps:
1. Determine the ranking for each country with respect to each of the social indicators. Rank them separately.
   i. GNP is ranked from high to low, so that France is ranked first.
   ii. Countries are ranked with respect to their level of LE, again from high to low. France is ranked first. Where cases are tied, the ranks that would otherwise occur are averaged: Malaysia and Ireland share ranks 2 and 3, so each receives (2 + 3)/2 = 2.5.
   iii. Birth rates (CBR) are ranked from high to low, in order to be consistent. Sierra Leone ranks first.
2. Calculate the difference between the paired ranks for each country and square it.
                                         RANK
Country          GNP    LE   CBR     GNP     LE    CBR
Algeria         2360    65    34       4      6      4
India            340    59    31       9      9      7
Mongolia         780    62    36       8      8    2.5
El Salvador      940    64    36       7      7    2.5
Ecuador         1120    66    32       6      5    5.5
Malaysia        1940    74    32       5    2.5    5.5
Ireland         7750    74    18       2    2.5      9
Argentina       2520    71    21       3      4      8
France         16090    76    14       1      1     10
Sierra Leone     240    47    48      10     10      1
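The ranking step, including the averaging of tied ranks, can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the helper name `average_ranks` is hypothetical, and the life expectancy values are copied from the table in the order Algeria to Sierra Leone.

```python
def average_ranks(values, descending=True):
    """Rank values (1 = first), giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=descending)
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        # Find the run of sorted positions that share the same value.
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        # Sorted positions pos..end hold ranks pos+1..end+1; average them.
        mean_rank = (pos + 1 + end + 1) / 2
        for k in range(pos, end + 1):
            ranks[order[k]] = mean_rank
        pos = end + 1
    return ranks


# LE values in table order: Algeria, India, Mongolia, El Salvador, Ecuador,
# Malaysia, Ireland, Argentina, France, Sierra Leone.
le = [65, 59, 62, 64, 66, 74, 74, 71, 76, 47]
le_ranks = average_ranks(le)
print(le_ranks)
```

Malaysia and Ireland (both 74) would otherwise take ranks 2 and 3, so each receives 2.5, matching the LE rank column.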
Steps: 2.  Calculate and square the difference.
                    Rank          Difference
Country          GNP     LE       Di      Di²
Algeria            4      6       -2        4
India              9      9        0        0
Mongolia           8      8        0        0
El Salvador        7      7        0        0
Ecuador            6      5        1        1
Malaysia           5    2.5      2.5     6.25
Ireland            2    2.5     -0.5     0.25
Argentina          3      4       -1        1
France             1      1        0        0
Sierra Leone      10     10        0        0
TOTAL                              0     12.5
Solution Steps:
3. The sum of the final column is ∑ Di² = 12.50. Using the formula
$$ r_s = 1 - \frac{6\sum D^2}{n(n^2 - 1)} $$
the value of the Spearman rank correlation coefficient is
$$ r_s = 1 - \frac{6 \times 12.50}{10 \times (10^2 - 1)} = 1 - \frac{75}{990} = 0.924 $$
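The arithmetic for GNP and LE can be checked with a few lines of Python. This is a sketch that takes the two rank columns from the table as given; the variable names are illustrative only.

```python
# Rank columns in table order: Algeria, India, Mongolia, El Salvador, Ecuador,
# Malaysia, Ireland, Argentina, France, Sierra Leone.
gnp_rank = [4, 9, 8, 7, 6, 5, 2, 3, 1, 10]
le_rank = [6, 9, 8, 7, 5, 2.5, 2.5, 4, 1, 10]

d = [g - l for g, l in zip(gnp_rank, le_rank)]  # Di column
d2 = [x ** 2 for x in d]                        # Di squared column

n = len(d)
sum_d = sum(d)     # rank differences always sum to zero
sum_d2 = sum(d2)
r_s = 1 - 6 * sum_d2 / (n * (n ** 2 - 1))
print(sum_d, sum_d2, round(r_s, 3))
```

The zero total in the Di column is a useful check that no rank was mistyped before the differences are squared.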
Solution
•  In conclusion, the Spearman rank correlation between gross national product per capita (GNP) and life expectancy (LE) is 0.924, which indicates a very strong positive correlation between the two variables.
Exercise:
•  Proceed with the second part of the question, find the relationship between the rankings of GNP and the crude birth rate (CBR).
Exercise:

                    Rank          Difference
Country          GNP    CBR       Di      Di²
Algeria            4      4        0        0
India              9      7        2        4
Mongolia           8    2.5      5.5    30.25
El Salvador        7    2.5      4.5    20.25
Ecuador            6    5.5      0.5     0.25
Malaysia           5    5.5     -0.5     0.25
Ireland            2      9       -7       49
Argentina          3      8       -5       25
France             1     10       -9       81
Sierra Leone      10      1        9       81
TOTAL                              0   291.00
Solution
The sum of the final column is ∑ Di² = 291.00. Using the formula
$$ r_s = 1 - \frac{6\sum D^2}{n(n^2 - 1)} $$
the value of the Spearman rank correlation coefficient is
$$ r_s = 1 - \frac{6 \times 291}{10 \times (10^2 - 1)} = 1 - \frac{1746}{990} = -0.764 $$
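The exercise can also be checked end to end from the raw GNP and CBR values. The sketch below is one possible way to do it, not the method prescribed by the slides: the helper `rank_desc` is hypothetical and assigns each value the number of strictly higher values plus the midpoint of its tie group, which is equivalent to averaging tied ranks.

```python
def rank_desc(values):
    """High-to-low ranks; tied values share the average of their ranks."""
    ranks = []
    for v in values:
        higher = sum(1 for w in values if w > v)  # values ranked ahead of v
        tied = sum(1 for w in values if w == v)   # size of v's tie group (incl. v)
        ranks.append(higher + (tied + 1) / 2)
    return ranks


# Raw values in table order: Algeria, India, Mongolia, El Salvador, Ecuador,
# Malaysia, Ireland, Argentina, France, Sierra Leone.
gnp = [2360, 340, 780, 940, 1120, 1940, 7750, 2520, 16090, 240]
cbr = [34, 31, 36, 36, 32, 32, 18, 21, 14, 48]

gnp_rank = rank_desc(gnp)
cbr_rank = rank_desc(cbr)

n = len(gnp)
sum_d2 = sum((g - c) ** 2 for g, c in zip(gnp_rank, cbr_rank))
r_s = 1 - 6 * sum_d2 / (n * (n ** 2 - 1))
print(sum_d2, round(r_s, 3))
```

Because both variables are ranked from high to low, the negative sign reflects that countries near the top of the GNP ranking sit near the bottom of the CBR ranking.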
Solution
•  The Spearman rank correlation between gross national product per capita (GNP) and crude birth rate (CBR) is -0.764, which indicates a strong negative correlation: countries with higher GNP tend to have lower birth rates.