6 minute read
INFERENTIAL STATISTICS PARAMETRIC TESTS
from Statistics Manual
by IFMSA-Egypt
Inferential Statistics is not a straightforward task. Because we can never know for sure the true underlying state of the population (i.e. reality), we always have to make some assumptions.
These assumptions are not lucky guesses; they are very specific and calculated steps that help us move forward with our data analysis in the most efficient way (inferential statistics is all about looking beyond the data).
One of the main uses of these assumptions is to help us differentiate between two very different procedures in inferential statistics: Parametric and Nonparametric tests.
Parametric Tests
Parametric Tests are the conservative version of inferential statistics, in which researchers make numerous strict assumptions about the variables.
Some of the common assumptions for the parametric tests include: Normality, Randomness, Absence of Outliers, Homogeneity of Variances and Independence of Observations.
Due to the scope of this introductory manual, we will only discuss the three most important assumptions.
Parametric Tests Assumptions
Normality
The most important distinction that helps you with choosing the type of test you will use is whether the results for the variable you are measuring are normally distributed.
This can be done either Graphically, by plotting the data and looking at the graph; or Analytically, using one of the common tests (e.g. Shapiro–Wilk Test or Kolmogorov–Smirnov Test).
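As an illustration, the sketch below checks normality analytically with SciPy's Shapiro–Wilk test. The sample values are simulated placeholders, not data from this manual.

```python
# Hedged sketch: Shapiro-Wilk normality check on simulated data using SciPy.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
sample = rng.normal(loc=100, scale=15, size=30)  # hypothetical measurements

stat, p_value = stats.shapiro(sample)
print(f"Shapiro-Wilk W = {stat:.3f}, p = {p_value:.3f}")

# A p-value above 0.05 gives no evidence against normality,
# so a parametric test would usually be considered appropriate.
if p_value > 0.05:
    print("No evidence against normality.")
else:
    print("Data deviate from normality; consider a nonparametric test.")
```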
Absence of Outliers
After checking for normality, researchers review their dataset for the presence of Outliers. An Outlier is a datapoint that does not fit with the rest of the dataset (it lies far away from the other observations); a common screening rule is sketched after the list below.
These datapoints primarily exist for one of two reasons:
High Variability, in which case the datapoint ultimately cannot be excluded from the dataset during the analysis.
Experimental Error, which can sometimes be a reason to exclude datapoints from the dataset.
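One common screening rule flags any observation lying more than 1.5 interquartile ranges beyond the first or third quartile. The values below are made up for illustration.

```python
# Hedged sketch: flagging outliers with the 1.5 * IQR rule on made-up data.
import numpy as np

values = np.array([4.1, 4.3, 4.0, 4.2, 4.4, 4.1, 9.8])  # hypothetical; 9.8 looks extreme

q1, q3 = np.percentile(values, [25, 75])
iqr = q3 - q1
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr

outliers = values[(values < lower) | (values > upper)]
print("Outliers:", outliers)  # datapoints outside the 1.5 * IQR fences
```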
Homogeneity of Variances
This fancy statistical assumption simply means that if you are comparing two variables (e.g. two groups), they have to have equal variances (recall that variance is SD²). Otherwise, the study will produce bias: because the results of the analysis should apply to the whole population, both samples should represent that population, and comparable variances help ensure this is true.
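Analytically, equality of variances is often checked with Levene's test; the sketch below applies SciPy's version to two made-up groups.

```python
# Hedged sketch: Levene's test for homogeneity of variances (made-up groups).
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
group_a = rng.normal(50, 5, size=20)   # hypothetical group with SD ~5
group_b = rng.normal(52, 5, size=20)   # hypothetical group with a similar SD

stat, p_value = stats.levene(group_a, group_b)
print(f"Levene W = {stat:.3f}, p = {p_value:.3f}")

# p > 0.05: no evidence that the variances differ, so the assumption holds;
# otherwise a correction such as Welch's t-test is commonly used instead.
```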
Independent/Two-sample T-test
Used to compare differences between two variables coming from two Separate populations (e.g. one that took caffeine and another one that took a placebo).
Independent T-test Hypotheses
1. H0: The two samples come from the same population (same mean). This would usually mean that the two populations sampled had no true difference in reality.
2. H1: The two samples come from two different populations (different means).
This may mean that the measured effect size reflects a true difference in reality (the caffeine does differ from the placebo).
Independent T-test Example
A study of the effect of caffeine on muscle metabolism used eighteen male volunteers each of whom underwent arm exercise tests. Half of the participants were randomly selected to take a capsule containing pure caffeine one hour before the test.
The other half received a placebo capsule at the same time. During each exercise, the subjects’ ratio of CO2 produced to O2 consumed was measured (RER), an indicator of whether energy is being obtained from carbohydrates or fats.
The question of interest to the experimenters was whether, on average, caffeine changes RER. The two populations being compared are “men who have not taken caffeine” and “men who have taken caffeine”.
If caffeine has no effect on RER the two sets of data can be regarded as having come from the same population.
CI and Significance of the Results
1. 95% CI for the Effect Size: (-0.4, 13.1)
2. Sig. = 0.063
Interpretation:
1. No Statistical Significance for the Effect Size (p-value/sig. > 0.05).
2. The 95% Confidence Interval passes through zero (the Value of No Difference: the Null Value); therefore, the null hypothesis can’t be rejected and it may be concluded that caffeine has no effect on muscle metabolism.
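A minimal sketch of how such an independent-samples t-test could be run in Python follows; the RER values are invented placeholders, not the study's data.

```python
# Hedged sketch: independent-samples t-test with SciPy (hypothetical RER values).
from scipy import stats

placebo  = [105, 119, 100, 97, 96, 101, 94, 95, 98]   # made-up RER values (scaled)
caffeine = [96, 99, 94, 89, 96, 93, 88, 105, 88]      # made-up RER values (scaled)

result = stats.ttest_ind(caffeine, placebo)  # assumes equal variances by default
print(f"t = {result.statistic:.2f}, p = {result.pvalue:.3f}")

# If p > 0.05 and the 95% CI for the mean difference crosses zero,
# the null hypothesis of no caffeine effect cannot be rejected.
```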
Dependent/Paired T-test
Used to compare two related means (mostly coming from a Repeated Measures design). In other words, it is used to compare differences between two variables coming from One Population, but under Differing Conditions (e.g. two observations for the same participants, one before and one after an intervention).
Dependent T-test Hypotheses
1. H0: The Effect Size for the “before” vs “after” measurements is equal to zero. This would usually mean that the two observations had no true difference in reality.
2. H1: The Effect Size for the “before” vs “after” measurements is not equal to zero.
This may mean that the measured effect size reflects a true difference in reality: the “before intervention” mean does differ from the “after intervention” mean.
Dependent T-test Example
An experiment was conducted in which a new exercise program was tested to determine whether it increases the hemoglobin content of the participants.
Fourteen participants took part in the three-week exercise program. Their hemoglobin content was measured before and after the program.
Here, we need to test the null hypothesis: The effect size for hemoglobin scores before and after the exercise program is zero; against the alternative hypothesis: The program is effective.
Significance of the Results: Sig. = 0.110
Interpretation:
No Statistical Significance for the Effect Size (p-value/sig. > 0.05). Therefore, the null hypothesis can’t be rejected and it may be concluded that the exercise program is not effective in improving hemoglobin content.
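A minimal sketch of the paired analysis in Python follows; the before/after hemoglobin pairs are hypothetical, not the manual's data.

```python
# Hedged sketch: paired (dependent) t-test with SciPy on made-up hemoglobin pairs.
from scipy import stats

before = [12.1, 13.0, 11.8, 12.5, 13.2, 12.0, 11.9, 12.7, 13.1, 12.3, 12.8, 11.7, 12.6, 13.0]
after  = [12.4, 13.1, 12.0, 12.4, 13.5, 12.2, 12.0, 12.9, 13.0, 12.6, 12.9, 11.9, 12.8, 13.2]

result = stats.ttest_rel(after, before)  # pairs each participant's two measurements
print(f"t = {result.statistic:.2f}, p = {result.pvalue:.3f}")

# p > 0.05 would mean the mean before-after difference is not
# statistically distinguishable from zero.
```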
One-way Analysis Of Variance (ANOVA)
An extension of the t-test, used to determine whether there are any statistically significant differences between the means of three or more independent groups.
ANOVA Hypotheses
1. H0: All groups’ means in the population are equal.
2. H1: At least one group’s mean in the population differs from the others.
ANOVA Example
A study was conducted to assess whether there are differences in the mean IQ scores of three groups of undergraduate students majoring in different disciplines: Physics, Maths, and Chemistry. Each group included 15 students.
Relevant ANOVA Result: P-value = 0.000
Interpretation:
Statistical Significance (p-value/sig. < 0.05). Therefore, the null hypothesis of the equality of means can be rejected: at least one of the groups is different. To figure out the group causing this difference in the IQ scores, further statistical tests (i.e. post-hoc analysis) might be performed.
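A sketch of a one-way ANOVA in Python on three invented IQ samples; the post-hoc step is only indicated in a comment.

```python
# Hedged sketch: one-way ANOVA with SciPy on made-up IQ scores for three majors.
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
physics   = rng.normal(120, 10, size=15)  # hypothetical IQ scores
maths     = rng.normal(125, 10, size=15)
chemistry = rng.normal(110, 10, size=15)

f_stat, p_value = stats.f_oneway(physics, maths, chemistry)
print(f"F = {f_stat:.2f}, p = {p_value:.4f}")

# If p < 0.05, at least one group mean differs; a post-hoc test
# (e.g. Tukey's HSD) would then identify which pairs differ.
```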
Association
An Association is any relationship between two continuous variables. By looking at association, graphically or analytically, we can determine the strength of the relationship (using correlation coefficients), whether the variables change together, and the direction of the statistical relationship between them (e.g. direct or inverse).
You might have heard or read the phrase: association does not imply causation. Causality can be understood as the interaction between two events where one is a consequence of the other.
Consequently, a simple statistical relationship does NOT necessarily indicate causation (remember the Confounding Factors).
The words Association and Correlation are usually used interchangeably, but we need to pinpoint the difference to give a relevant mathematical context for the rest of this manual.
Correlation is the type of association that describes a linear relationship between two variables.
Pearson’s Correlation Coefficient (r)
It ranges from +1 to -1, with positive values suggesting a positive relationship (direct) and negative values suggesting a negative relationship (inverse).
An (r) of zero suggests that there is no relationship and that the two variables are independent from each other.
The closer the r value is to one of the extremes (+1 or -1), the stronger and more linear the relationship in that direction.
Pearson’s Correlation Coefficient Example
Researchers collected data from 20 children (n = 20) about their Age, IQ, and Short-term Memory (STM) span, then computed Pearson’s Correlation Coefficient for each pair of variables.
Interpretation:
Any p-value less than 0.05 is considered statistically significant for the correlation between the corresponding pair of variables. Therefore, a statistically significant strong positive linear correlation is observed between Age and STM Span (r = 0.723, p = 0.000).
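As a sketch, Pearson's r and its p-value can be computed with SciPy; the Age and STM values below are invented stand-ins for the 20 children.

```python
# Hedged sketch: Pearson's correlation coefficient with SciPy (made-up data).
from scipy import stats

age = [4, 5, 5, 6, 6, 7, 7, 8, 8, 9, 9, 10, 10, 11, 11, 12, 12, 13, 13, 14]   # years
stm = [3, 3, 4, 4, 4, 5, 4, 5, 5, 6, 5, 6, 6, 6, 7, 7, 6, 7, 8, 8]            # STM span

r, p_value = stats.pearsonr(age, stm)
print(f"r = {r:.3f}, p = {p_value:.3f}")

# r close to +1 with p < 0.05 indicates a statistically significant,
# strong positive linear correlation between Age and STM span.
```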
Regression
Used to determine the strength of, and quantify, the relationship between one dependent variable (y) and one or more independent variables (x). You can relate a set of x and y values using many types of regression equations.
The most used regression equation is the linear regression model, where you draw single or multiple lines that relate your x and y variables. As this is an introductory manual, we will only discuss the simple linear regression model.
We use the simple linear regression model when we investigate the relationship between a dependent variable and only one independent variable. This model is a very simple equation describing the best single straight line passing through the data observed: “y = a + b(x)”
1. y: the dependent variable
2. a: the constant/y-intercept
3. b: the slope of the line
4. x: the independent variable
Simple Linear Regression Example
A dataset obtained from a sample of girls was investigated to determine the relationship between their Age (in whole years) and their Forced Vital Capacity (FVC, in Liters). The calculations were carried out using a statistical algorithm and the final equation was defined as “FVC = 0.305 + 0.193(Age)”.
This means that as age increases by 1 year, the FVC increases by 0.193 liters. We will leave the 0.305 intercept for a more advanced setting.
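A sketch of fitting a simple linear regression in Python and reading off the fitted equation; the Age/FVC values are hypothetical, so the fitted coefficients will not exactly match the 0.305 and 0.193 above.

```python
# Hedged sketch: simple linear regression with SciPy (hypothetical Age/FVC data).
from scipy import stats

age = [10, 11, 12, 13, 14, 15, 16, 17]            # years (made-up)
fvc = [2.3, 2.4, 2.6, 2.9, 3.0, 3.2, 3.4, 3.6]    # liters (made-up)

fit = stats.linregress(age, fvc)
print(f"FVC = {fit.intercept:.3f} + {fit.slope:.3f} * Age  (r^2 = {fit.rvalue**2:.2f})")

# The slope is the expected change in FVC for a one-year increase in age;
# e.g. the predicted FVC at age 12:
print("Predicted FVC at age 12:", fit.intercept + fit.slope * 12)
```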