Correlation and Regression AGB 111
Correlation
• Correlation is the statistical tool used to study the relationship between two variables.
Dr. R. Jayashree, Asst. Prof. (AGB), Veterinary College, Bangalore
• Uni-variate analysis: Analysis of data when only one variable is involved.
• E.g. dispersion, central tendency, skewness, kurtosis.
• Bi-variate analysis: Analysis of two variables that are related to each other.
• In biological experiments it is used to measure the strength of a relationship, or to predict one variable from another related variable.
• It helps in measuring the independence of, or the relationship between, bi-variate data.
Correlation and causation
• A high degree of correlation may arise from any one, or a combination, of the following reasons.
1. By chance: Because of a small number of observations, a correlation may sometimes exist in a sample even though it does not exist in the population.
2. Influence of external factors on both variables: A high degree of correlation may be due to the same causes affecting each variable.
3. Mutual influence: The two variables influence each other.
4. Influence of one variable upon the other: One variable is truly independent, acts free from any external force, and influences the other variable, which is truly dependent because it reacts in response to the independent variable.
A mutual relationship could depend on
• Mutual dependence: e.g. supply and demand.
• Both variables being influenced by the same external factors: e.g. the effect of weather on rice and potato yield.
• Pure chance: e.g. shoe size and degree of intelligence. This is known as spurious or nonsense correlation.
Types of correlation
1. Positive or negative correlation.
2. Simple, partial or multiple correlation.
3. Linear or non-linear correlation.
Positive or negative correlation
• It depends on the direction in which the variables move.
• When both variables move in the same direction, the correlation is positive.
• When they move in opposite directions, the correlation is negative.
[Figure: Negative correlation]
Simple, partial and multiple correlation
• Simple: Only two variables are involved.
• Partial or multiple: More than two variables are involved.
• Multiple correlation: The relationship between one dependent variable and two or more independent variables is studied. E.g. feed intake, body weight and milk yield.
• Partial correlation: The relationship between two variables is studied while the influence of the other variables is excluded.
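As an illustration of partial correlation, the sketch below uses the standard first-order formula r12.3 = (r12 − r13·r23) / √((1 − r13²)(1 − r23²)), which gives the correlation between variables 1 and 2 after excluding variable 3. The formula is not stated on the slide, and the three simple correlations used here are hypothetical values chosen only for the example.

```python
import math

def partial_correlation(r12, r13, r23):
    """First-order partial correlation between variables 1 and 2,
    excluding the influence of variable 3."""
    numerator = r12 - r13 * r23
    denominator = math.sqrt((1 - r13 ** 2) * (1 - r23 ** 2))
    return numerator / denominator

# Hypothetical simple correlations (not taken from the slides).
r12, r13, r23 = 0.8, 0.6, 0.5

r12_3 = partial_correlation(r12, r13, r23)
print(round(r12_3, 4))
```

When variable 3 is uncorrelated with both of the others (r13 = r23 = 0), the partial correlation reduces to the simple correlation r12.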
Linear correlation
X: 30 60 90 120 150 180
Y: 10 20 30 40 50 60
[Figure: plot of Y against X; the points lie on a straight line]
Non-linear correlation
X: 30 60 90 120 150 180
Y: 10 50 60 20 50 60
[Figure: plot of Y against X; the points do not lie on a straight line]
Methods of studying correlation
• Scatter diagram method and graphical method: both visualize the relationship.
• Coefficient of correlation: measures the relationship.
• Scatter diagram: Plotting the two variables against each other on a graph sheet shows the relationship between them.
Scatter Plots
• If the points are widely scattered, there is little or no relationship.
• If the points are closely clustered, there is some relationship between the two variables.
Correlation Coefficient - Karl Pearson
$$ r = \frac{\operatorname{Cov}(X, Y)}{S_X S_Y} = \frac{\sum (X - \bar{X})(Y - \bar{Y}) / (n - 1)}{S_X S_Y} $$
where $S_X$ and $S_Y$ are the standard deviations of X and Y.
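Karl Pearson's coefficient is the covariance of X and Y divided by the product of their standard deviations. A minimal sketch of that calculation follows, applied to the X and Y values from the linear and non-linear correlation slides. The function name `pearson_r` is only an illustrative choice, and the n − 1 divisor is assumed for both the covariance and the standard deviations.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation: Cov(X, Y) / (S_X * S_Y), using n - 1 divisors."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)
    s_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / (n - 1))
    s_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / (n - 1))
    return cov / (s_x * s_y)

# Values from the linear and non-linear correlation slides.
x = [30, 60, 90, 120, 150, 180]
y_linear = [10, 20, 30, 40, 50, 60]
y_nonlinear = [10, 50, 60, 20, 50, 60]

r_linear = pearson_r(x, y_linear)
r_nonlinear = pearson_r(x, y_nonlinear)
print(round(r_linear, 4), round(r_nonlinear, 4))
```

The perfectly linear series gives r equal to 1 (up to rounding), while the irregular series gives a much weaker positive value.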
Distinction between linear and non-linear correlation
It is based on the ratio of change between the variables under study.
X: 1 2 3 4 5
Y: 5 7 9 11 13
• For a unit change in X there is a constant change of 2 in Y: Y = 2X + 3.
• Two variables X and Y are linearly related if there exists a relationship of the form Y = a + bX.
Non-linear or curvilinear
If there is no constant change in Y for every unit change in X, the relationship is termed non-linear or curvilinear; it cannot be written in the form Y = a + bX.
Linear:
X: 1 2 3 4 5
Y: 5 7 9 11 13
Non-linear:
X: 1 2 3 4 5
Y: 5 7 12 15 13
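The constant-change test can be checked directly by taking the successive differences in Y. The sketch below uses the two Y series from the slides and assumes, as in those tables, that X increases by exactly one unit at each step; the helper name `unit_changes` is illustrative.

```python
def unit_changes(ys):
    """Change in Y between consecutive observations (X assumed to step by 1)."""
    return [later - earlier for earlier, later in zip(ys, ys[1:])]

linear_y = [5, 7, 9, 11, 13]
nonlinear_y = [5, 7, 12, 15, 13]

linear_steps = unit_changes(linear_y)
nonlinear_steps = unit_changes(nonlinear_y)

# A relationship is linear when every unit change in X gives the same change in Y.
linear_is_constant = len(set(linear_steps)) == 1
nonlinear_is_constant = len(set(nonlinear_steps)) == 1

print(linear_steps, linear_is_constant)
print(nonlinear_steps, nonlinear_is_constant)
```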
Depending upon the distribution of points in the scatter plot
• High degree of positive correlation
• High degree of negative correlation
• Low degree of positive correlation
• Low degree of negative correlation
Standard error
It tests the reliability of the observed r.
$$ \text{S.E.}(r) = \frac{1 - r^2}{\sqrt{n}} $$
Probable error
• Probable error (P.E.) = 0.6745 × S.E.(r)
• The factor 0.6745 arises because, in a normal distribution, 50% of the observations lie in the range µ ± 0.6745σ.
• r ± P.E. indicates the limits within which the population correlation coefficient may be expected to lie.
• If r < P.E., then r is not significant.
• If r > 6 × P.E., then r is definitely significant.
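The standard error and probable error of r can be worked through numerically. The sketch below assumes S.E.(r) = (1 − r²)/√n and the common textbook convention that r is treated as definitely significant only when it exceeds six times the probable error. The values r = 0.8 and n = 25 are hypothetical.

```python
import math

def standard_error(r, n):
    """Standard error of r, assuming S.E.(r) = (1 - r^2) / sqrt(n)."""
    return (1 - r ** 2) / math.sqrt(n)

def probable_error(r, n):
    """Probable error of r: 0.6745 times the standard error."""
    return 0.6745 * standard_error(r, n)

# Hypothetical sample values (not from the slides).
r, n = 0.8, 25

se = standard_error(r, n)
pe = probable_error(r, n)
lower, upper = r - pe, r + pe  # limits for the population correlation

# Assumed convention: below P.E. is not significant, above 6 x P.E. is significant.
if abs(r) < pe:
    verdict = "not significant"
elif abs(r) > 6 * pe:
    verdict = "significant"
else:
    verdict = "inconclusive"

print(round(se, 4), round(pe, 4), round(lower, 4), round(upper, 4), verdict)
```

Values of r that fall between P.E. and 6 × P.E. are reported as inconclusive here; that label is a choice made for the example.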
• Sometimes P.E. may lead to a wrong conclusion, especially when r is small. In such cases the significance can be tested by Student's t test.
$$ t = \frac{r \sqrt{n - 2}}{\sqrt{1 - r^2}} $$
• Tested at n − 2 degrees of freedom.
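A small worked example of the t test for r, assuming the statistic t = r·√(n − 2) / √(1 − r²) with n − 2 degrees of freedom. The values r = 0.3 and n = 10 are hypothetical, and the critical value 2.306 is the two-tailed 5% point for 8 degrees of freedom as read from a standard t table; it is hard-coded here rather than computed.

```python
import math

def t_statistic(r, n):
    """t statistic for testing the significance of a correlation coefficient."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

# Hypothetical sample values (not from the slides).
r, n = 0.3, 10

t = t_statistic(r, n)
df = n - 2

# Two-tailed 5% critical value for 8 degrees of freedom, taken from a t table.
critical_t = 2.306
significant = abs(t) > critical_t

print(round(t, 4), df, significant)
```

Here the calculated t is smaller than the table value, so a correlation of 0.3 from only 10 pairs would not be declared significant at the 5% level.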
Rank correlation
• When a statistical series is not quantitative, r cannot be calculated by Karl Pearson's method, e.g. for qualitative traits such as honesty, beauty, intelligence and morality.
• Charles Edward Spearman proposed rank correlation for such cases.
• Individuals are ranked on each character, and the rank correlation coefficient rho (ρ) is calculated from the ranks.
Rank correlation
$$ \rho = 1 - \frac{6 \sum d^2}{n(n^2 - 1)} $$
where d is the difference between the two ranks of an individual and n is the number of individuals.
X: 1 2 3 4 5
Y: 3 4 5 2 1
d = X − Y: −2 −2 −2 2 4 (∑d = 0)
d²: 4 4 4 4 16 (∑d² = 32)
$$ \rho = 1 - \frac{6 \sum d^2}{n(n^2 - 1)} = 1 - \frac{6 \times 32}{5 \times 24} = -0.6 $$
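The rank correlation example above can be reproduced in a few lines. This sketch assumes the inputs are already ranks with no ties, as in the slide's table; the function name `spearman_rho` is illustrative.

```python
def spearman_rho(rank_x, rank_y):
    """Spearman rank correlation: 1 - 6 * sum(d^2) / (n * (n^2 - 1)), no ties assumed."""
    n = len(rank_x)
    d = [x - y for x, y in zip(rank_x, rank_y)]
    sum_d_squared = sum(diff ** 2 for diff in d)
    return 1 - 6 * sum_d_squared / (n * (n ** 2 - 1))

# Ranks from the worked example.
rank_x = [1, 2, 3, 4, 5]
rank_y = [3, 4, 5, 2, 1]

sum_d = sum(x - y for x, y in zip(rank_x, rank_y))
sum_d2 = sum((x - y) ** 2 for x, y in zip(rank_x, rank_y))
rho = spearman_rho(rank_x, rank_y)

print(sum_d, sum_d2, round(rho, 4))
```

The sum of the rank differences is always zero, which serves as a quick check on the arithmetic before squaring.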
Coefficient of determination
• It gives the percentage of variation in the dependent variable that is accounted for by the independent variable.
• It is the ratio of explained variance to total variance: r² = explained variance / total variance.
• If r = 0.8, then r² = 0.64, which indicates that 64% of the variation in the dependent variable is accounted for by the independent variable.
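To see that r² equals the ratio of explained to total variation, the sketch below fits a least-squares line to a small hypothetical data set (not from the slides) and compares the two quantities. Sums of squares are used in place of variances, since the common divisor cancels in the ratio.

```python
# Hypothetical data, chosen only to illustrate the identity.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n

sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
sxx = sum((xi - mean_x) ** 2 for xi in x)
syy = sum((yi - mean_y) ** 2 for yi in y)  # total variation in Y

# Least-squares line of Y on X.
b = sxy / sxx
a = mean_y - b * mean_x

# Variation in Y explained by the fitted line.
explained = sum((a + b * xi - mean_y) ** 2 for xi in x)

r_squared = sxy ** 2 / (sxx * syy)
ratio = explained / syy

print(round(r_squared, 4), round(ratio, 4))
```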
Regression
• The regression coefficient gives the amount of change in the dependent variable for every unit change in the independent variable.
• It ranges from −∞ to +∞.
• It is expressed in units of the dependent variable per unit of the independent variable.
Regression coefficients
$$ b_{xy} = \frac{\frac{\sum xy}{n} - \bar{x}\,\bar{y}}{\sigma_y^2} $$
X is dependent and Y is independent.
$$ b_{yx} = \frac{\frac{\sum xy}{n} - \bar{x}\,\bar{y}}{\sigma_x^2} $$
Y is dependent and X is independent.
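Both regression coefficients share the same numerator, the mean product of x and y minus the product of the means, and differ only in which variable's variance appears in the denominator. The sketch below assumes variances with divisor n, computed as the mean of squares minus the square of the mean, and uses hypothetical data.

```python
# Hypothetical data (not from the slides).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n

# Shared numerator: mean product minus product of means.
cov_xy = sum(xi * yi for xi, yi in zip(x, y)) / n - mean_x * mean_y

# Variances with divisor n (mean of squares minus square of mean).
var_x = sum(xi ** 2 for xi in x) / n - mean_x ** 2
var_y = sum(yi ** 2 for yi in y) / n - mean_y ** 2

b_yx = cov_xy / var_x  # Y dependent, X independent
b_xy = cov_xy / var_y  # X dependent, Y independent

print(round(b_yx, 4), round(b_xy, 4))
```

In general the two coefficients differ, because each answers a different prediction question.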
Regression equation
$$ Y = a + bX, \qquad a = \bar{Y} - b\bar{X} $$
If a = 2.5 and b = 0.46, then Y = 2.5 + 0.46X.
The regression equation may then be used to estimate the value of Y when a value of X is known.
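Once a and b are known, prediction is a single evaluation of the line Y = a + bX. The sketch below uses the slide's values a = 2.5 and b = 0.46; the X value of 10 and the function names are hypothetical, and the intercept is assumed to be computed as the mean of Y minus b times the mean of X.

```python
def intercept(mean_x, mean_y, b):
    """Intercept of the regression line, assuming a = mean(Y) - b * mean(X)."""
    return mean_y - b * mean_x

def predict_y(x, a=2.5, b=0.46):
    """Estimate Y from X with the regression equation Y = a + bX."""
    return a + b * x

# Hypothetical X value at which Y is to be estimated.
estimated_y = predict_y(10)
print(round(estimated_y, 2))
```

Estimates are most trustworthy for X values inside the range of the data used to fit the line.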
Correlation vs. Regression
1. Correlation: Means the relationship between two variables.
   Regression: The regression coefficient gives the amount of change in the dependent variable for every unit change in the independent variable.
2. Correlation: Measures the direction and degree of the relationship.
   Regression: Establishes a functional relationship, which is used to predict the dependent variable for any given value of the independent variable.
3. Correlation: Need not imply cause and effect.
   Regression: Treats one variable as the cause (independent) and the other as the effect (dependent).
4. Correlation: The correlation coefficient is a relative measure and ranges between −1 and +1.
   Regression: Regression coefficients are absolute measures and range from −∞ to +∞.
5. Correlation: There can be nonsense correlation.
   Regression: There is no nonsense regression.
6. Correlation: Limited application.
   Regression: Wider application.
Relation between r and the regression coefficients:
$$ r = b_{yx} \frac{S_x}{S_y}, \qquad r = b_{xy} \frac{S_y}{S_x} $$
$$ b_{yx} = r \frac{S_y}{S_x}, \qquad b_{xy} = r \frac{S_x}{S_y} $$
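The link between r and the two regression coefficients, b_yx = r·S_y/S_x and b_xy = r·S_x/S_y, can be verified numerically. The sketch below uses hypothetical data and sample standard deviations from Python's `statistics` module; the regression coefficients are computed independently from sums of deviations so that the comparison is not circular.

```python
import statistics

# Hypothetical data (not from the slides).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

n = len(x)
mean_x = statistics.mean(x)
mean_y = statistics.mean(y)
s_x = statistics.stdev(x)  # sample standard deviation (n - 1 divisor)
s_y = statistics.stdev(y)

sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
sxx = sum((xi - mean_x) ** 2 for xi in x)
syy = sum((yi - mean_y) ** 2 for yi in y)

r = (sxy / (n - 1)) / (s_x * s_y)

# Regression coefficients computed directly from the sums of deviations.
b_yx = sxy / sxx
b_xy = sxy / syy

# The same coefficients recovered from r and the standard deviations.
b_yx_from_r = r * s_y / s_x
b_xy_from_r = r * s_x / s_y

print(round(b_yx, 4), round(b_yx_from_r, 4))
print(round(b_xy, 4), round(b_xy_from_r, 4))
```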